I, Douglas L. Jones, the Declarant, hereby declare as follows:



1. I am currently a Professor of Electrical and Computer Engineering engaged in research in
the area of acoustic signal processing and hearing aids. 



2. I have expertise in the fields of electronics, signal processing, acoustics, and hearing aids.
My work experience, education, and other credentials in these fields are further
documented as attached hereto in Exhibit A.



3. I have reviewed U.S. Patent No. 09/193,058 filed November 16, 1998 (the "Subject
Application"), and pending claims that have been or will be proposed as attached hereto
in Exhibit B (collectively designated the "Patent Claims" hereinafter).



4. I have also reviewed the United States Patent Office Action mailed 27 September 2004 
(hereinafter the "Office Action") that rejects the Subject Application, and I have reviewed
the references asserted in the Office Action, U.S. Patent No. 6,002,776 to Bhadkamkar,
U.S. Patent No. 4,601,025 to Lea, and U.S. Patent No. 5,581,620 to Brandstein et al., all
of which are attached in Exhibit C. 



5. The documents of Exhibits B and C are all directed to technologies in which I have 
expertise. 

6. Based on my review of Exhibits B and C, those skilled in the art would be discouraged 
from considering the combination of Bhadkamkar, Lea, and Brandstein as proposed in 
the Office Action. The operability and/or suitability for intended use would be 
undermined by the asserted combinations. Correspondingly, those skilled in the art 
would not reasonably expect success to result. For instance, the Office Action proposes 
to "incorporate an individual correlator system as taught by Lea as part of the DOA 
estimator of the system of Bhadkamkar." (Office Action, page 5). The Office Action 
indicates that an "individual correlator system" includes at least delay lines 42 and 44 and 
correlators 46. Correlators 46 each provide an output, three of which are specifically 
designated by d, e, and f. If this is the interface to separator 30 intended in the Office 
Action, using only some of the outputs or combining them in some manner would likely 
undermine operability of the proposed combination - or at the very least require undue 
experimentation and/or significant undisclosed modifications to succeed.



7. In another instance, the Office Action cites to column 4, lines 59-66 of Lea stating that 
the "identification of a peak indication gives rise to the time differential between the
angle of incidence of the received sound and the phase center of the two receiving 
elements." Based on a review of column 4, line 59 - column 5, line 10 of Lea, it does not 
appear to support this statement. It is surmised that the Office Action could be 
contending that the referenced time differentials of Lea are being interfaced with the 
separator 30 of Bhadkamkar - that is, t1 and/or t2 rather than the correlator outputs (d, e, f)
are provided to separator 30. However, as perhaps best shown in Fig. 4b of Lea, the
derivation of t1 and of t2 depends on the inputs a, b, and c from two other delay lines 51
and 52, and associated correlators 53 in addition to inputs d, e, and f from correlators 46. 
Correspondingly, the inclusion of this additional hardware (four delay lines and 
numerous correlators 46 and 53) is inconsistent with the two input, two output DOA 
estimator 20 of Bhadkamkar. Moreover, it is questionable that separator 30 would 
properly work with t1 and/or t2 as defined by Lea. Indeed, Lea depends on subtraction
network 54 to provide a signal difference, which is needed to subsequently determine 
incidence angle with computer 55. Even with further speculative, undisclosed 
modifications, the prospect of success remains dubious. 

8. Based on my review of Exhibits B and C, there does not appear to be any teaching, 
suggestion, or disclosure of generation of a characteristic signal representative of the 
desired acoustic signal that occurs during performance of determining location of the 
second source; where the interfering signal is from this second source. 

9. Attached as Exhibit D is a copy of a publication in the Journal of the Acoustical Society 
of America (hereinafter "JASA"). JASA is one of the pre-eminent scholarly publications
in the field to which the Patent Claims pertain. Specifically, "A two-microphone dual
delay-line approach for extraction of a speech sound in the presence of multiple
interferers", JASA 110(6) (December 2001) (hereinafter the "JASA Article") describes
the virtues of the inventions defined by at least claims 34, 46, 61, and 62 in connection 
with section II.A. Furthermore, the discussion in section I. of the JASA Article describes
the long-standing desire for a way to extract signals subject to the "cocktail party" effect. 
As described in the Subject Application, page 14, line 24 - page 15, line 32, the 
inventions defined by claims 34-35, 37-42, 44-48, 50-54, and 56-66 provide significant 
advances over the state of the art by providing source separation of as little as 2 degrees
azimuthally and extracting a desired signal of lesser intensity than a close interferer to
address this long-felt desire. 

10. Based on information and belief, the owner of all right, title and interest to the Subject 
Application, is the Board of Trustees of the University of Illinois (hereinafter the 
"Owner"). The Owner has licensed the subject patent application to Phonak AG by 
written agreement dated November 1, 2000. 
11. The undersigned, being hereby warned that willful false statements and the like are
punishable by fine or imprisonment, or both (18 U.S.C. 1001), and may jeopardize the
validity of the application or any patent issuing thereon, declares that all statements made
of his/her own knowledge are true and that all statements made on information and belief
are believed to be true.
Douglas L. Jones



OFFICE ADDRESS                     HOME ADDRESS

Coordinated Science Laboratory     1214 W. Church St.
1308 W. Main Street                Champaign, IL 61821
University of Illinois             (217) 359-2038
Urbana, IL 61801
(217) 244-6823
(217) 244-1642 (FAX)
dl-jones@uiuc.edu



EDUCATION 

Rice University Ph.D. in Electrical and Computer Engineering, 1987 

MSEE in Electrical and Computer Engineering, 1985 
BSEE in Electrical Engineering (Summa Cum Laude), 1983 



AWARDS and HONORS 

Fellow, IEEE, 2002 

Connexions Author of the Year, 2003 

Fulbright Junior Research Fellowship, 1987-1988 

National Science Foundation Graduate Fellowship, 1983-1986 

Member of Phi Beta Kappa, Tau Beta Pi, and Eta Kappa Nu Honor Societies 



EXPERIENCE 

1988-Present University of Illinois at Urbana-Champaign 

Professor, Dept. of Electrical and Computer Engineering, 1998-Present 
Research Professor, Coordinated Science Laboratory 
Research Professor, Beckman Institute 
Associate Professor, 1993-1998 
Assistant Professor, 1988-1993 

Research and teaching in the area of signal and image processing. 



Spring 2002 University of California, Berkeley 

Visiting Scholar 
Spring 1999 Rice University

Texas Instruments Visiting Professor 



Summer 1998 University of Cambridge 

Participant, Programme on Nonlinear and Nonstationary Signal Processing 
Isaac Newton Institute for Mathematical Sciences 



Spring 1995 University of Washington 

Visiting Scientist 



1987-1988 Universität Erlangen-Nürnberg, Germany

Fulbright Postdoctoral Research Fellow 



1984-Present Consultant 

Consultant for several firms on the topics of magnetic field modeling, high- 
resolution NMR spectral analysis, optical time-domain reflectometry, sonar signal 
analysis, adaptive equalization, and electrocardiogram analysis.

1983-1987 Rice University 

Research Assistant, Dept. of Electrical and Computer Engineering 
Research on time-frequency and time-varying signal analysis, and FFT algorithms. 
Also developed a signal processing laboratory course and wrote an associated text- 
book, published by Prentice-Hall. 



PROFESSIONAL ACTIVITIES 

Member-at-large of the Board of Governors of the IEEE Signal Processing Society 2002-2004

Co-Chairman NSF/ONR Workshop on Signal Processing in Manufacturing 

and Machine Monitoring, March 1996, Alexandria, Virginia 

Co-Chairman Allerton Conference on Communications, Control,

and Computing, October 2000, October 2001 

Associate Editor IEEE Signal Processing Letters, 1997-1999 

Technical Program Comm. 1994 IEEE/SP Symp. on Time-Frequency and Time-Scale Analysis

Panels NSF Research Initiation Award Panel, 1994 

NSF Small Business Innovative Research Panel, 1995 
NSF Signal Processing Systems Review Panel, 1998 
Reviewer IEEE Transactions on Signal Processing

IEEE Transactions on Information Theory 

IEEE Transactions on Biomedical Engineering 

IEEE Transactions on Aerospace and Electronic Systems 

IEEE Transactions on Circuits and Systems 

IEEE Signal Processing Letters 

IEEE Spectrum Magazine 

Proceedings of the IEEE 

Circuits, Systems, and Signal Processing 

Applied Signal Processing 

IEEE Intl. Conf. on Acoustics, Speech, and Signal Processing 
IEEE/SP Symp. on Time-Frequency and Time-Scale Analysis
Allerton Conference on Communications and Control 
Midwest Symposium on Circuits and Systems 
National Science Foundation 
University of Illinois Campus Research Board 
McGraw-Hill Book Company 
Prentice-Hall Publishers, Inc. 

Member IEEE Signal Processing Society 

IEEE Communications Society 



RECENT DEPARTMENTAL AND CAMPUS SERVICE 



Department ECE Curriculum Committee, 1990-1994, 1999-2001, 2002-Present

Beckman Institute Program Advisory Committee, 2000-2001, 2002-present 

Secretary, Curriculum Revision Subcommittee, 1993-1994 

Fellowship Committee, 1995-2001 

Computer Resources Committee, 1994-1995 

Chair, Computer Resources Committee, 1996-1997 

Chair, Circuits and Signal Processing Area Committee, 1995-present 

CEPS Industrial Affiliates Program member, 1989-present 

CEPS Rockwell Corp. Liaison, 1993-present 

CEPS Steering Committee, 1993-1994, 1996-present 

ECE 320 Course Director, 1989-present 



College College of Engineering Executive Committee, 1997-2000 

Associate Director, UI Motorola Center for Communications, 1997-present 

Campus University Senate, 1992-1994, 1996-1998, 1999-2001 
Senate Educational Policy Committee, 1993-1994 
Chair, Senate Subcommittee on Tuition Surcharge Policy, 1993-1994 
Senate Council Coord. Group on Instructional Resource Improvement and Funding, 1994 
Senate Academic Calendar Committee, 1996-1998



GRANTS AND CONTRACTS 

NSF Research Initiation Award MIP-8908777, entitled "Area-Efficient VLSI Implementation of
Signal Processing Algorithms via Multiple Coefficient Recoding," June 1989 through November
1991, amount: $63,119.

JSEP Contract N00014-89-C-0149 Unit 17, entitled "Adaptive Signal Processing," duration: Au- 
gust 1989 through July 1992, amount: $360,000, with Profs. H.V. Poor, W.K. Jenkins, and K.S. 
Arun. 

NSF MIP 90-12747, entitled "Modeling of Time-Varying Signals," duration: 9/15/90-2/29/93,
amount: $148,694, with Prof. K.S. Arun. 

State of Illinois SCCA 91-82121, entitled "A Technology Transfer Initiative for ACT Signal Micro- 
Processor Technology," duration: 12/1/90-6/30/91, amount: $300,000, with Profs. W.K. Jenkins, 
D.L. Jones, D.C. Munson, Jr., Y. Bresler, N. Ahuja, and S. Franke. 

DARPA N00014-91-J-1844, entitled "A Technology Transfer Initiative for ACT Signal Micro- 
Processor Technology," duration: 6/15/91-3/14/92, amount: $150,000, with Profs. W.K. Jenkins, 
D.L. Jones, D.C. Munson, Jr., Y. Bresler, N. Ahuja, and S. Franke. 

JSEP Contract N00014-89-C-0149 Unit 19, entitled "Multi-Dimensional Adaptive Signal Process- 
ing," duration: October 1992 through September 1995, amount: $280,000, with Profs. W.K. Jenkins 
and M.T. Orchard. 

Naval Surface Warfare Center, N00167-93-M-3228, entitled "Energy Partitioning in the Time- 
Frequency Plane," duration: March 1993 through February 1994, amount: $24,935. 

Naval Surface Warfare Center, N000167-94-M-7133, entitled "Energy Partitioning in the Time- 
Frequency Plane," duration: March 1994 through February 1995, amount: $24,982. 

DARPA, University of Minnesota subcontract entitled "Scalable Library for Digital Signal Process- 
ing," duration: March 1994 through December 1996, amount: $102,889, with Profs. K. Gallivan, 
D. Munson, and M. Orchard. 

Center for Research on Applied Signal Processing, USC Subcontract PO#665433, entitled "Optimal 
and Adaptive Time-Frequency Methods for Detection and Estimation," duration: September 1994 
through August 1996, amount: $90,000. 

ONR AASERT Grant N00014-95-1-0907, entitled "Energy Partitioning Using Overdetermined Ba- 
sis Decompositions," duration: May 1995 through May 1998, amount: $90,356. 

ONR Contract N00014-95-1-0674, entitled "Adaptive and Optimal Time-Frequency Methods for 
Nonstationary Signals," duration: June 1995 through September 1998, amount: $210,000. 
JSEP Contract N00014-96-1-0129 Unit 11, entitled "Acquisition and Demodulation for Wireless
Communications," duration: October 1995 through September 1998, amount: $360,000, with Profs.
D.V. Sarwate and U. Madhow. 

Hewlett-Packard Equipment Gift No. 34748, entitled "Equipment Proposal for Communications 
Laboratories," duration: April 1997, amount: $71,980, with Profs. S. Franke and G. Papen.

NSF Grant no. MIP-9707742, entitled "Unified Algorithms and Architectures for Low-Power Wire- 
less Video," duration: September 1997 through August 2001, amount: $479,878, with Profs. K.
Ramchandran and N. Shanbhag. 

ONR Contract N00014-95-1-0674 (extension), entitled "Time-Frequency-Space Processing And
Multi-Component Signal Classification" duration: February 1998 through September 2001, amount: 
$182,994. 

Texas Instruments DSP Equipment Gift for ECE DSP Laboratory, date: October 1998, amount: 
$17,955. 

PhysioControls contract, entitled "Noise-Resistant Cardiac Arrhythmia Detector," duration: Jan 
1998 through August 1999, amount: $23,325 

University of Illinois Mary Jane Neer Research Fund, "High-Performance Dual Channel DSP-Based
Acoustic Processor for Use in Hearing Aids," duration: January 1998 through December 1999,
amount: with Profs. R. Bilger, A. Feng, C. Lansing, W. O'Brien, and B. Wheeler

NIH, National Cancer Institute Grant no. CA079179, entitled "In Vivo Ultrasonic Microprobe for
Tumor Diagnosis," duration: amount: $125,906 annual direct costs with Profs. W.D. O'Brien, Jr., 
J.F. Zachary, D.A. Payne, C. Liu 

NSF contract CCR-9979381, entitled "An Integrated Exploration of Wireless Network Communi- 
cation," duration: October 1999 through September 2001, amount: $466,636, with Profs. B. Hajek, 
R. Blahut, U. Madhow, and N. Shanbhag 

GRASP, entitled "Modulation Classification," duration: October 1999 through September 2000, 
amount: $36,000 

Texas Instruments DSP Equipment Gift for ECE DSP Laboratory, date: February 2000, amount: 
$42,985. 

Motorola Corp. contract, entitled "Joint Source-Channel Matching for Wireless Multimedia Com- 
munication," duration: January 2000 through August 2002, amount: $180,000, with Prof. N. 
Shanbhag 

NIH contract, entitled "Real Time Implementation of Intelligent Hearing Aids," duration: July 
2000 through June 2002, amount: $254,804, with Profs. B. Wheeler, A. Feng, W.D. O'Brien, C. 
Lansing, and R. Bilger 
Microsensor," duration: Sept.
PERSONAL

Born in Dallas, Texas on January 17, 1961. Married to
Catherine A. Schmidt-Jones, three children.
MAJOR RESEARCH ACCOMPLISHMENTS



Time-frequency and joint signal analysis: 

• Adaptive time-frequency representations. Developed the concept of adaptive time-
frequency representations (TFRs), and several of the leading signal-dependent and adap-
tive transforms. These include adaptive window short-time Fourier transforms, adaptive
wavelet transforms, adaptive optimal kernels, adaptive cone kernels, and fast algorithms 
for their computation. Our recent "consistent time-frequency representation" achieves 
unprecedented resolution and cross-term suppression. 

• Statistical time-frequency analysis. Developed a fundamental and comprehensive 
theory of statistical time-frequency analysis. We derived optimal kernels for nonsta-
tionary spectrum estimation and time-frequency estimation of TFRs of noisy or random
signals. A theory of time-frequency detection has determined the class of detection prob-
lems for which time-frequency-based detection is globally optimal, the optimal kernels
for these classes, and efficient methods of implementing these detectors. These tech- 
niques have been applied in a number of applications, including machine-fault detection 
and diagnosis, microembolus detection, and ECG classification. 

• Time-frequency-space processing. Developed efficient, near-optimal time-frequency- 
space detectors and estimators for partially coherent arrays. Highly efficient, nearly 
optimal quadratic narrowband array detection algorithms have been an important by- 
product of this research. 

• Generalized joint signal representations. Contributed several advances in gener-
alized joint signal representations, including a time-frequency-based derivation of the
chirplet transform, new orthogonal chirped bases, and unitarily transformed joint sig- 
nal representations. We showed the equivalence of Cohen's and Baraniuk's methods for 
constructing general joint signal representations and contributed new insights to this 
theory. We extended the theory of adaptive and statistically optimal TFRs to gener-
alized joint signal representations. We have developed four-parameter joint quadratic
time-frequency-delay-doppler representations for applications such as improved adaptive 
time-frequency analysis and detection and classification. 

• Nonstationary blind source separation and interference cancellation. In- 
troduced new adaptive methods for blind source separation of nonstationary signals. 
These methods are simpler than existing methods and continuously track environmental 
changes. New methods for extraction of speech signals in cluttered environments (the 
"cocktail party" environment) have yielded great improvements over existing methods 
(see expanded discussion below). 

Wavelet techniques and applications: 

• Denoising. Innovative new methods for denoising multichannel data provide much bet-
ter performance than single-channel methods and are very efficient; applications to hy-
perspectral imagery have shown more than 10 dB SNR gain. Derived worst-case bounds
for the performance of denoising methods for both orthonormal bases and overdeter-
mined frames. New "Bayesian pursuit" methods offer improved denoising performance
using overdetermined frames and a hierarchical statistical signal model.
• Generalized wavelet decompositions and transforms. Developed new orthogonal
chirped wavelet bases, unitarily transformed basis decompositions, and efficient algo- 
rithms and implementations. Developed the chirplet transform from a time-frequency 
context. 

• MEMS sensors. In a joint project with Profs. Chang Liu and Naresh Shanbhag, we are 
developing distributed, multi-element touch and flow sensors and embedded architectures 
and algorithms for extracting sophisticated touch and flow information, such as texture, 
turbulence, softness, and slippage to create an "artificial skin." In collaboration with 
Prof. Ron Miles at SUNY Binghamton, we are developing algorithms for high-accuracy
direction-finding and signal recovery using colocated arrays of directional MEMS sensors. 

Adaptive signal processing: 

• Blind equalization. Developed a vector constant modulus algorithm for blind equal- 
ization of shaped channels, the first practical solution to this problem. 

• Nonlinear adaptive filters. Developed a low-complexity, LMS-like algorithm for gen- 
eral systems with a memoryless nonlinearity. Application to nonlinear echo cancellation 
demonstrated substantial improvement over 

• Nonstationary blind source separation. Introduced new adaptive methods for blind
source separation of nonstationary signals, including both instantaneous and convolutive
(dynamic) mixtures. These methods are simpler than existing methods and continuously
track environmental changes. Research continues on faster algorithms.

• Nonstationary adaptive beamforming. Developed a frequency-domain minimum- 
variance distortionless-response beamformer for small arrays with unprecedented perfor- 
mance in the recovery of speech in nonstationary interference. 

• Algorithms. Developed reduced-complexity and reduced-delay implementations for 
adaptive filters. Analyzed the transpose-form implementation for FIR adaptive filters 
and demonstrated its advantages for high-speed pipelined implementation. 

Biomedical applications: 

• Binaural and directional hearing aids. A multidisciplinary group based in the 
Beckman Institute has developed new binaural algorithms for the extraction of a de- 
sired source from a cluttered acoustic environment, such as in a restaurant or cocktail 
party. The new methods show remarkable performance improvements over conventional 
techniques in such environments, and have been implemented in a real-time DSP-based 
system. A new array technology using combinations of directional and omni microphones 
shows great promise for obtaining higher directivity in small (e.g., BTE) packages. Major 
research efforts toward commercialization for advanced hearing aids and other acoustic
extraction applications such as hands-free telephony, automotive and military applica- 
tions, and noise suppression for speech recognition systems continue. 

• Electrocardiogram analysis. Developed improved methods for denoising electrocar- 
diograms using adaptive time-frequency processing.

• Microembolus detection. Developed wavelet-based and chirped wavelet detectors 
from ultrasound reflections from microemboli. Theoretical and experimental studies 
demonstrated that these methods approach optimal performance. 
• Ultrasound image formation and analysis. Developed fast frequency-domain three-
dimensional reconstruction algorithms for image reconstruction from measurements on
a circular aperture. Applications include high-resolution imaging from ultrasound micro- 
probes on the end of a needle and from small ultrasound catheters. Developed methods 
for detecting edges and tissue boundaries in ultrasound images. 

In collaboration with Prof. W.D. O'Brien, developed new methods for aberration correc- 
tion in ultrasound imaging that perform much better than existing approaches in severe 
aberration. 

• Edge detection in ultrasound images. Developed near-optimal, computationally 
efficient methods for detecting boundary segments in ultrasound speckle images. 

• fMRI Image Denoising. Are developing (in collaboration with Prof. Farzad Ka-
malabadi and Dr. Keith Thulborn at UIC) efficient methods for blind removal of noise
from functional MRI image sequences. These techniques will allow precise imaging at 
much faster rates by greatly reducing the necessary averaging time to construct low-noise 
functional images. 

Telecommunications and other applications: 

• Peak Power Reduction for OFDM systems. Developing new methods provid-
ing unprecedented peak-to-average power ratio (PAR) reduction for large-constellation
OFDM systems. These methods are based on novel constellation-shaping approaches,
and obtain these reductions with no loss of data rate or increase in symbol error rate.
Waveform-modification methods that offer more modest reductions but are compatible 
with current standards have also been developed. Extension of these ideas to peak power 
reduction in optical and CDMA communication systems continues. 

• Optimal discrete multi-tone (DMT) power allocation. Developed the first fast, 
exactly optimal algorithm for power allocation in discrete multi-tone modulation. 

• Instantaneous frequency estimation/FM demodulation. Developed an adaptive 
TFR-based IF estimator for FM demodulation that lowers the SNR threshold by 3-4
dB over existing methods. 

• Joint source-channel matching. In collaboration with Profs. Shanbhag and Ram- 
chandran, developed joint source-channel coding methods for wireless image and video 
transmission. These general methods allow near-optimal matching of most source and 
channel codes, as well as on-line adaptation to time-varying channels. Techniques that 
minimize the total system power have also been developed. Developed optimal meth- 
ods for multi-level broadcasting, joint source-network coding for video transmission over 
wireless networks and the internet, and total system optimization for distributed wireless 
sensor networks. 

• Wireless communication for binaural hearing aids. Developing new methods for 
low-power wireless communication for binaural hearing aids and other near-the-body 
applications. 

• Nonstationary interference cancellation. Developed new frame-based methods for 
blind removal of nonstationary interference from direct-sequence spread-spectrum com- 
munications signals. 
• Stochastic sensor networks. Developing a very simple, robust, low-power sensor
network approach and protocol for large networks of sensors. Each sensor controls power 
by operating, independently of the others, on a low duty cycle. Recent results prove 
that such a network can operate reliably with very high probability and can perform all
network functions without any coordination of wake/sleep cycle between the nodes. 

Low-Power Computer Systems: 

• Global Resource Allocation through Cooperation (GRACE). In collaboration 
with Profs. S. Adve, R. Kravets, and K. Nahrstedt in the Computer Science Department, 
we are developing a new computing framework allowing joint, cooperative adaptation of 
the hardware, networking, operating system, and media application software to jointly 
minimize the total energy consumption in power-limited, mobile, general-purpose com- 
puters. 

Signal processing algorithms and implementations: 

• FFTs and Hartley transforms. Performed the first accurate analysis of the compu- 
tational complexity of the Hartley transform, showing conclusively that it is virtually 
equivalent to the real-valued FFT. Co-authored two heavily-cited papers on the Hartley 
transform and real-valued FFTs. 

• Joint hardware/algorithm design. Developed several new algorithms/architectures
for FFTs, FIR filters, and adaptive filters offering higher performance or reduced hard- 
ware complexity. 



MAJOR ACCOMPLISHMENTS IN TEACHING 

• Digital Signal Processing Laboratory Textbook. Developed the first textbook for a 
DSP-microprocessor-based laboratory course. This text spurred the development of hands-on 
DSP laboratory courses at many universities, and similar courses are now a mainstay of many 
electrical and computer engineering curricula in the U.S. and around the world. 

Completed the first open-source, on-line DSP laboratory textbook as part of the Connexions 
project (http://cnx.rice.edu). 

• ECE 320: Digital Signal Processing Laboratory. Introduced a digital signal processing 
laboratory at the University of Illinois in 1989. Innovations in both the content and the 
teaching methods keep this laboratory at the forefront of hands-on DSP education. During the 
next few years, this laboratory and the students will participate in a NSF-supported research 
project for the development of next-generation DSP compiler technology. Over the years, 
equipment donations from Texas Instruments, Motorola, and Analog Devices have equipped 
the instructional laboratory with state-of-the-art real-time DSP platforms. This course has 
become one of the most popular laboratory courses in the ECE department (capped at 60 
students per semester due to space limitations) and is now offered every semester. ECE 320 
has been designated a Texas Instruments Elite Laboratory. 
• ECE 210: Analog Signal Processing. Assisted in the development of the course outline
and the laboratories for this innovative sophomore course, which replaces the traditional
sophomore circuits course and the junior signals and systems course. Taught the first full-
scale offering of ECE 210 (110+ students). This course is required for all EE and CE majors
in the current undergraduate curricula at Illinois.

• Nonmajors course on Information Technology. With Prof. Michael Loui, developed
a course, to be ECE 101, on information technology and engineering for non-engineering
students. Digital information technology is introduced at several levels, including audio and
video media, digital logic, and the Internet. The course is half lab-based, with students doing
real engineering designs so that they also learn the process of technology development and
the tradeoffs faced by engineers. The course satisfies the General Education requirements
for non-science-and-engineering students.

• Curriculum development. Served as secretary of the subcommittee which drafted the
new Electrical Engineering curriculum at the University of Illinois, and actively involved in
current curricular revision.

• Laboratory enhancement. Secured a $71,980 equipment gift from Hewlett-Packard for
equipping communications-related laboratories in the department.
PUBLICATIONS



Books 

1. Jones, D.L. and T.W. Parks, A Digital Signal Processing Laboratory for the TMS32010, 
Prentice-Hall, Englewood Cliffs, New Jersey, 1988. 

2. Appadwedula, S., M.J. Berry, M. Butala, M.A. Haun, D.L. Jones, M.L. Kramer, D. Moussa, 
Patent Claims



1-33. (Canceled).

34. (Previously presented) A method of signal processing, comprising: 

(a) detecting an acoustic excitation at both a first location to provide a
corresponding first signal and at a second location to provide a corresponding second
signal, the excitation being a composite of a desired acoustic signal from a first source 
and an interfering acoustic signal from a second source spaced apart from the first source; 

(b) determining location of the second source relative to the first source as a 
function of the first and second signals, which includes delaying each of the first and 
second signals by several time intervals to provide several delayed first signals and 
several delayed second signals and providing a time increment representative of 
separation of the first source from the second source; and 

(c) generating a characteristic signal representative of the desired acoustic 
signal during performance of said determining, the characteristic signal being a function 
of the time increment. 

35. (Previously presented) The method of claim 34, wherein the characteristic signal 
corresponds to spectral content of the desired acoustic signal and further comprising 
providing an output signal representative of the desired acoustic signal as a function of 
the characteristic signal. 

36. (Canceled). 

37. (Currently amended) The method of claim 34, wherein said determining includes: 

establishing a signal pair, the signal pair having a first member from the delayed 
first signals and a second member from the delayed second signals, the characteristic 
signal being determined from the signal pair. 

38. (Previously presented) The method of claim 34, further comprising providing an 
output signal representative of the desired acoustic signal, and wherein the desired 
acoustic signal includes speech and the output signal is provided by a hearing aid device. 

39. (Currently amended) The method of claim 34, wherein said determining further 
includes: 

(b1) converting the first and second signals from an analog representation to a 
discrete representation; 

(b2) transforming the first and second signals from a time domain 
representation to a frequency domain representation; and 

(b3) establishing a signal pair representative of separation of the first source from 
the second source, the signal pair having a first member from the delayed first signals and 
a second member from the delayed second signals. 

40. (Currently amended) The method of claim 39, wherein the characteristic signal 
corresponds to a fraction with a numerator determined from at least the first and second 
members, and a denominator determined from at least the time increment. 

41. (Previously presented) The method of claim 39, wherein said generating further 
includes: 

(c1) determining the characteristic signal from the signal pair and the first time 
increment, the characteristic signal being representative of spectral content of the desired 
acoustic signal; 

(c2) transforming the characteristic signal from a frequency domain 
representation to a time domain representation; and 

(c3) providing an audio output signal representative of the desired acoustic 
signal as a function of the characteristic signal. 

42. (Currently amended) The method of claim 41, further comprising establishing a 
further time increment corresponding to separation of the first source from the second 
source by comparing the delayed first and second signals, and 

wherein the time increment corresponds to a first phase difference, the further 
time increment corresponds to a second phase difference, and the characteristic signal 
includes a spectral representation determined from at least the first and second phase 
differences. 

43. (Canceled). 

44. (Previously presented) The method of claim 34, wherein separation of the second 
source is within five degrees of the first source relative to a zero degree azimuthal 
reference axis intersecting the first source and a midpoint situated between the first and 
second locations. 

Patent Claim List 
Application No.: 09/193,058 
Inventors: Feng, et al. 
Filed: November 16, 1998 
22010-127/321877 

45. (Previously presented) The method of claim 34, further comprising: 

(d) establishing a number of location signals each corresponding to a different 
location relative to the first source; and 

(e) selecting the characteristic signal from the location signals, the 
characteristic signal being representative of the location of the second source relative to 
the first source, the characteristic signal including a spectral representation of the desired 
acoustic signal. 

46. (Previously presented) A method of signal processing, comprising: 

(a) detecting an acoustic excitation at a first location to provide a 
corresponding first signal and at a second location to provide a corresponding second 
signal, the excitation being a composite of a desired acoustic signal from a first source 
and an interfering acoustic signal from a second source spaced apart from the first source; 

(b) localizing the second source relative to the first source as a function of the 
first and second signals, said localizing including establishing a number of location 
signals each corresponding to a different location relative to the first source, delaying 
each of the first and second signals by a number of time intervals to provide a number of 
delayed first signals and a number of delayed second signals, and establishing a signal 
pair that has a first member from the delayed first signals and a second member from the 
delayed second signals; and 

(c) generating a characteristic signal from the location signals, wherein the 
characteristic signal includes a spectral representation of the desired acoustic signal from 
the first source, corresponds to position of the second source, and is determined from the 
signal pair. 

47. (Previously presented) The method of claim 46, further comprising providing an 
output signal representative of the desired acoustic signal as a function of the 
characteristic signal. 

48. (Currently amended) The method of claim 46, wherein said localizing includes: 

determining a time increment representative of separation of the first source from 
the second source, the characteristic signal being a function of the time increment. 

49. (Canceled). 

50. (Previously presented) The method of claim 46, further comprising providing an 
output signal representative of the desired acoustic signal, and wherein the desired 
acoustic signal includes speech and the output signal is provided by a hearing aid device. 

51. (Currently amended) The method of claim 46, wherein said localizing further 
includes: 

(b1) converting the first and second signals from an analog representation to a 
discrete representation; 

(b2) transforming the first and second signals from a time domain 
representation to a frequency domain representation; and 

(b3) establishing a first time increment and a signal pair each representative of 
separation of the first source from the second source, the signal pair having a first 
member from the delayed first signals and a second member from the delayed second 
signals. 

52. (Previously presented) The method of claim 51, wherein the characteristic signal 
corresponds to a fraction with a numerator determined from at least the first and second 
members, and a denominator determined from at least the first time increment. 

53. (Previously presented) The method of claim 51, wherein said generating further 
includes: 

(c1) determining the characteristic signal from the signal pair and the first time 
increment; 

(c2) transforming the characteristic signal from a frequency domain 
representation to a time domain representation; and 

(c3) providing an audio output signal representative of the desired acoustic 
signal as a function of the characteristic signal. 

54. (Previously presented) The method of claim 53, further comprising establishing a 
second time increment corresponding to separation of the first source from the second 
source by comparing the delayed first signals and delayed second signals, and 

wherein the first time increment corresponds to a first phase difference, the 
second time increment corresponds to a second phase difference, and the spectral 
representation of the characteristic signal is determined from at least the first and second 
phase differences. 

55. (Canceled). 

56. (Previously presented) The method of claim 1, wherein separation of the second 
source is within five degrees of the first source relative to a zero degree azimuthal 
reference axis intersecting the first source and a midpoint situated between the first and 
second locations. 

57. (New) The method of claim 34, wherein the characteristic signal corresponds to a 
fraction with a numerator determined from a difference between a first member of the 
delayed first signals and a second member of the delayed second signals, and a 
denominator determined from at least the time increment. 

58. (New) The method of claim 57, which includes providing the delayed first signals 
from a first multistage delay line and the delayed second signals from a second multistage 
delay line, the first member being output by a stage of the first delay line corresponding 
to the location of the second source and the second member being output by a stage of the 
second delay line corresponding to the location of the second source, and a different stage 
of each of the first delay line and the second delay line corresponding to location of the 
first source. 

59. (New) The method of claim 58, wherein the difference is representative of a 
minimized interfering acoustic signal level and provides the characteristic signal 
representative of spectral content of the desired acoustic signal. 

60. (New) The method of claim 46, wherein the generating includes determining the 
characteristic signal as a fraction with a numerator being a function of a difference 
between one of the delayed first signals and one of the delayed second signals, the 
difference being representative of a minimized interfering acoustic signal level, and the 
fraction having a denominator determined as a function of at least the first time 
increment. 

61. (New) A method of signal processing, comprising: 

detecting an acoustic excitation at both a first location to provide a corresponding first 
signal and at a second location to provide a corresponding second signal, the excitation being a 
composite of a desired acoustic signal from a first source and an interfering acoustic signal 
from a second source spaced apart from the first source; 

incrementally delaying the first signal to provide a number of delayed first signals and the 
second signal to provide a number of delayed second signals, a number of different pairings of 
the delayed first signals and the delayed second signals representing different locations; 

localizing the second source relative to one of the different locations as a function of a 
difference between the members of a corresponding one of the different pairings; and 

generating a characteristic signal representative of spectral content of the desired acoustic 
signal based on the difference and a time increment corresponding to distance separating the 
first source and the second source. 

62. (New) A method of signal processing, comprising: 

detecting an acoustic excitation at both a first location to provide a corresponding first 
signal and at a second location to provide a corresponding second signal, the excitation being a 
composite of a desired acoustic signal from a first source and an interfering acoustic signal 
from a second source spaced apart from the first source; 

selecting the desired acoustic signal by positioning a reference axis relative to the first 
source; 

localizing the second source relative to the reference axis as a function of the first and 
second signals; and 

generating a characteristic signal representative of the desired acoustic signal during 
performance of said localizing. 

63. (New) The method of claim 62, which includes: 

defining the reference axis relative to the first location and the second location; and 
moving the reference axis to select a different acoustic signal. 

64. (New) The method of claim 63, wherein the detecting the acoustic excitation is 
performed with a first sensor at the first location and a second sensor at the second location. 

65. (New) The method of claim 63, wherein the method is performed with a hearing aid. 

66. (New) The method of claim 63, wherein: 

the localizing includes establishing a number of delayed first signals each corresponding 
to a different one of a number of first delay stages of a first delay line and a number of delayed 
second signals each corresponding to a different one of a number of second delay stages of a 
second delay line; and 

the generating includes determining the characteristic signal as a function of a fraction 
with a numerator corresponding to a difference between one output of the first delay stages and 
one output of the second delay stages and a denominator corresponding to a time increment 
representative of a distance separating the first source and the second source. 
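
The fraction recited in claims 57, 60 and 66 (a numerator formed from a difference between delayed first and second signals, and a denominator determined from a time increment) can be illustrated with a short sketch. The following is only an illustration under stated assumptions and is not taken from the Subject Application: it assumes a single frequency component, a desired source on the reference axis that reaches both sensors simultaneously, and an interfering source whose wavefront reaches the two sensors `tau` seconds apart. The function name `recover_desired` and the signal model are hypothetical.

```python
import cmath
import math


def recover_desired(x1, x2, omega, tau):
    """Estimate the desired spectral component at angular frequency omega.

    Assumed model (hypothetical, not from the source document):
        x1 = S + I * exp(+1j * omega * tau / 2)
        x2 = S + I * exp(-1j * omega * tau / 2)
    where S is the desired component, I the interfering component, and
    tau the inter-sensor time difference of the interfering source.
    """
    # Shift the two signals in opposite directions so that the interfering
    # component lines up. A common bulk delay would make both shifts causal.
    d1 = x1 * cmath.exp(-1j * omega * tau / 2)
    d2 = x2 * cmath.exp(+1j * omega * tau / 2)

    # Numerator: the difference cancels the aligned interfering component.
    numerator = d1 - d2

    # Denominator: depends only on the time increment tau (and omega).
    denominator = -2j * math.sin(omega * tau / 2)
    if abs(denominator) < 1e-12:
        raise ValueError("sources are not separable at this frequency")
    return numerator / denominator


if __name__ == "__main__":
    S = complex(0.7, -0.2)   # desired component (arbitrary test value)
    I = complex(1.5, 0.9)    # interfering component (arbitrary test value)
    omega = 2 * math.pi * 1000.0
    tau = 1e-4
    x1 = S + I * cmath.exp(+1j * omega * tau / 2)
    x2 = S + I * cmath.exp(-1j * omega * tau / 2)
    print(recover_desired(x1, x2, omega, tau))
```

Under this model the difference removes the interfering component exactly, and dividing by the tau-dependent term undoes the scaling that the subtraction imposes on the desired component. When `tau` approaches zero the denominator vanishes, which is why the sketch raises an error rather than returning a value.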
DIRECTIONAL ACOUSTIC SIGNAL PROCESSOR AND METHOD THEREFOR 

This application is submitted with a microfiche appendix, containing copyrighted material. 
Copyright 1994, Interval Research Corporation. The Appendix consists of one (1) 
microfiche with forty-six (46) frames. The copyright owner has no objection to the 
facsimile reproduction by anyone of the patent document or the patent disclosure, as it 
appears in the Patent and Trademark Office patent file or records, but otherwise reserves 
all copyright rights whatsoever in the appendices. 

TECHNICAL FIELD 

This invention relates to the field of microphone-array signal processing, and more 
particularly to a two stage processor for extracting one substantially pure sound signal 
from a mixture of such signals even in the presence of echoes and reverberations. 

BACKGROUND OF THE INVENTION 

It is well known that a human being can focus attention on a single source of sound even 
in an environment that contains many such sources. This phenomenon is often called the 
"cocktail-party effect." 

Considerable effort has been devoted in the prior art to solve the cocktail-party effect, 
both in physical devices and in computational simulations of such devices. One prior 
technique is to separate sound based on auditory scene analysis. In this analysis, vigorous 
use is made of assumptions regarding the nature of the sources present. It is assumed that 
a sound can be decomposed into small elements such as tones and bursts, which in turn can 
be grouped according to attributes such as harmonicity and continuity in time. Auditory 
scene analysis can be performed using information from a single microphone or from several 
microphones. For an early example of auditory scene analysis, see Weintraub (1984, 1985, 
1986). Other prior art work related to sound separation by auditory scene analysis are due 
to Parsons (1976), von der Malsburg and Schneider (1986), Naylor and Porter (1991), and 
Mellinger (1991). 

Techniques involving auditory scene analysis, although interesting from a scientific point 
of view as models of human auditory processing, are currently far too computationally 
demanding and specialized to be considered practical techniques for sound separation until 
fundamental progress is made. 

Other techniques for separating sounds operate by exploiting the spatial separation of 
their sources. Devices based on this principle vary in complexity. The simplest such 
devices are microphones that have highly selective, but fixed patterns of sensitivity. A 
directional microphone, for example, is designed to have maximum sensitivity to sounds 
emanating from a particular direction, and can therefore be used to enhance one audio 
source relative to others (see Olson, 1967). Similarly, a close-talking microphone mounted 
near a speaker's mouth rejects distant sources (see, for example, the Knowles CF 2949 data 
sheet). 

Microphone-array processing techniques related to separating sources by exploiting spatial 
separation of their sources are also well known and have been of interest for several 
decades. In one early class of microphone-array techniques, nonlinear processing is 
employed. In each output stream, some source direction of arrival, a "look direction," is 
assumed. The microphone signals are delayed to remove differences in time of flight from 
the look direction. Signals from any direction other than the look direction are thus 
misaligned in time. The signal in the output stream is formed, in essence, by "gating" 
sound fragments from the microphones. At any given instant, the output is chosen to be 
equal to one of the microphone signals. These techniques, [illegible] Lyon (1983), perform 
best when the undesired sources consist predominantly of impulse trains, as is [illegible] 
can be very computationally efficient and are of [illegible] interest as models of human 
cocktail-party processing, they do not have practical or commercial significance because 
of their inherent inability to bring about suppression of unwanted sources. This inability 
originates from the incorrect assumption that at every instant in time, at least one 
microphone contains only the desired signal. 

One widely known class of techniques in the prior art for linear microphone-array 
processing is often referred to as "classical beamforming" (Flanagan et al., 1985). As 
with the nonlinear techniques mentioned above, processing begins with the removal of 
time-of-flight differences among the microphone signals with respect to the look 
direction. In place of the gating scheme, the delayed microphone signals are simply 
averaged together. Thus, any signal from the look direction is represented in the output 
with its original power, whereas signals from other directions are relatively attenuated. 

Classical beamforming was employed in a patented directional hearing aid invented by 
Widrow and Brearley (1988). The degree to which a classical beamformer is able to 
attenuate undesired sources relative to the desired source is limited by (1) the number of 
microphones in the array, and (2) the spatial extent of the array relative to the longest 
wavelength of interest present in the undesired sources. In particular, a classical 
beamformer cannot provide relative attenuation of frequency components whose wavelengths 
are larger than the array. For example, an array one foot wide cannot greatly attenuate 
frequency components below approximately 1 kHz. 

Also known from the prior art is a class of active-cancellation algorithms, which is 
related to sound separation. [illegible] derived from only one of the sources. For 
example, active noise cancellation techniques (see data sheets for Bose® Aviation Headset, 
NCT proACTIVE® Series, and Sennheiser HDC451 Noiseguard® Mobile Headphone) reduce the 
contribution of noise to a mixture by filtering a known signal that contains only the 
noise, and subtracting it from the mixture. Similarly, echo-cancellation techniques such 
as those employed in full-duplex modems (Kelly and Logan, 1970; Gritton and Lin, 1984) 
improve the signal-to-noise ratio of an outgoing signal by removing noise due to echoes 
from the known incoming signal. 

Techniques for active cancellation that do not require a reference signal are called 
"blind." They are now classified, based on the degree of realism of the underlying 
assumptions regarding the acoustic processes by which the unwanted signals reach the 
microphones. To understand the practical significance of this classification, recall a 
feature common to the principles by which active-cancellation techniques operate: the 
extent to which a given undesired source can be canceled by subtracting processed 
microphone signals depends ultimately on the exactness with which copies of the undesired 
source in the different microphones can be made to match one another. This depends, in 
turn, on how accurately the signal processing models the acoustic processes by which the 
unwanted signals reach the microphones. 
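
The remark above that an array one foot wide cannot greatly attenuate frequency components below approximately 1 kHz can be checked with a short calculation. The sketch below is illustrative only and rests on assumptions that do not appear in the reference: a speed of sound of 343 m/s, and a simple two-microphone averaging beamformer steered broadside with an unwanted plane wave arriving along the array axis. The function names are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 degrees C
FOOT_M = 0.3048             # one foot in meters


def wavelength_match_frequency(array_width_m, c=SPEED_OF_SOUND_M_S):
    """Frequency in Hz whose wavelength equals the array width."""
    return c / array_width_m


def two_mic_endfire_gain(freq_hz, spacing_m, c=SPEED_OF_SOUND_M_S):
    """Magnitude response of a two-microphone averaging beamformer.

    The beamformer is steered broadside (no steering delay) and the
    unwanted plane wave arrives along the array axis, so it reaches the
    second microphone spacing_m / c seconds after the first.
    """
    delay = spacing_m / c
    # The average of 1 and exp(-j*2*pi*f*delay) has magnitude
    # |cos(pi * f * delay)|.
    return abs(math.cos(math.pi * freq_hz * delay))


if __name__ == "__main__":
    print(wavelength_match_frequency(FOOT_M))   # roughly 1.1 kHz
    print(two_mic_endfire_gain(100.0, FOOT_M))  # close to 1: little attenuation
```

With these assumed values the wavelength equals the array width near 1.1 kHz, which agrees with the "approximately 1 kHz" figure, and at 100 Hz the off-axis source passes through the averaging stage almost unattenuated.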
One class of blind active-cancellation techniques may be called "gain-based": it is 
presumed that the waveform produced by each source is received by the microphones 
simultaneously, but with varying relative gains. (Directional microphones must be employed 
to produce the required differences in gain.) Thus, a gain-based system attempts to cancel 
copies of an undesired source in different microphone signals by applying relative gains 
to the microphone signals and subtracting, but never applying time delays or otherwise 
filtering. Numerous gain-based methods for blind active cancellation have been proposed; 
see Herault and Jutten (1986), Bhatti and Bibyk (1991), Cohen (1991), Tong et al. (1991), 
and Molgedey and Schuster (1994). 

The assumption of simultaneity is violated when microphones are separated in space. A 
class of blind active-cancellation techniques that can cope with non-simultaneous mixtures 
may be called "delay-based": it is assumed that the waveform produced by each source is 
received by the various microphones with relative time delays, but without any other 
filtering. (See Morita, 1991 and Bar-Ness, 1993.) Under anechoic conditions, this 
assumption holds true for a microphone array that consists of omnidirectional microphones. 
However, this simple model of acoustic propagation from the sources to the microphones is 
violated when echoes and reverberation are present. 

When the signals involved are narrowband, some gain-based techniques for blind active 
cancellation can be extended to employ complex gain coefficients (see Strube (1981), 
Cardoso (1989, 1991), Lacoume and Ruiz (1992), Comon et al. (1994)) and can therefore 
accommodate, to a limited degree, time delays due to microphone separation as well as 
echoes and reverberation. These techniques can be adapted for use with audio signals, 
which are broadband, if the microphone signals are divided into narrowband components by 
means of a filter bank. The frequency bands produced by the filter bank can be processed 
independently, and the results summed (for example, see Strube (1981) or the patent of 
Comon (1994)). However, they are computationally intensive relative to the present 
invention because of the duplication of structures across frequency bands, are slow to 
adapt in changing situations, are prone to statistical error, and are extremely limited in 
their ability to accommodate echoes and reverberation. 

The most realistic active-cancellation techniques currently known are "convolutive": the 
effect of acoustic propagation from each source to each microphone is modeled as a 
convolutive filter. These techniques are more realistic than gain-based and delay-based 
techniques because they explicitly accommodate the effects of inter-microphone separation, 
echoes and reverberation. They are also more [illegible] 

Convolutive active-cancellation techniques have recently been described by Jutten et al. 
(1992), by Van Compernolle and Van Gerven (1992), by Platt and Faggin (1992), and by Dinc 
and Bar-Ness (1994). While these techniques have been used to separate mixtures 
constructed by simulation using oversimplified models of room acoustics, to the best of 
our knowledge none of them has been applied successfully to signals mixed in a real 
acoustic environment. The simulated mixtures used by Jutten et al., by Platt and Faggin, 
and by Dinc and Bar-Ness differ from those that would arise in a real room in two 
respects. First, the convolutive filters used in the simulations are much shorter than 
those appropriate for modeling room acoustics; they allowed for significant indirect 
propagation of sound over only one or two feet, compared with tens of feet typical of 
echoes and reverberation in an office. Second, the mixtures used in the simulations were 
partially separated to begin with, i.e., the crosstalk between the channels was weak. In 
practice, the microphone signals must be assumed to contain strong crosstalk unless the 
microphones are highly directional and the geometry of the sources is constrained. 

To overcome some of the limitations of the convolutive active-cancellation techniques 
named above, the present invention [illegible] architecture. Its two-stage architecture is 
substantially different from other two-stage architectures found in prior art. 

A two-stage signal processing architecture is employed in a Griffiths-Jim beamformer 
(Griffiths and Jim, 1982). The first stage of a Griffiths-Jim beamformer is delay-based: 
two microphone signals are delayed to remove time-of-flight differences with respect to a 
given look direction and, in [illegible] classical beamforming, these delayed microphone 
signals are subtracted to create a reference noise signal. In a separate channel, the 
delayed microphone signals are added, as in classical beamforming, to create a signal in 
which the desired source is enhanced relative to the noise. The first stage of a 
Griffiths-Jim beamformer produces a reference noise signal and a signal that is 
predominantly desired source. The noise reference is then employed in the second stage, 
using standard active noise-cancellation techniques, to improve the signal-to-noise ratio 
in the output. 

The Griffiths-Jim beamformer suffers from the flaw that under reverberant conditions, the 
delay-based first stage cannot construct a reference noise signal devoid of the desired 
signal, whereas the second stage relies on the purity of that noise reference. If the 
noise reference is sufficiently contaminated with the desired source, the second stage 
suppresses the desired source, not the noise (Van Compernolle, 1990). Thus the 
Griffiths-Jim beamformer incorrectly suppresses the desired signal under conditions that 
are normally considered favorable: when the signal-to-noise ratio in the microphones is 
high. 

Another two-stage architecture is described by Najar et al. (1994). Its second stage 
employs blind convolutive active cancellation. However, its first stage differs 
significantly from the first stage of the Griffiths-Jim beamformer. It attempts to produce 
separated outputs by adaptively filtering each microphone signal in its own channel. When 
the sources are spectrally similar, filters that produce partially separated outputs after 
the first stage are unlikely to exist. 

Thus, it is desirable to provide an architecture for separation of sources that avoids the 
difficulties exhibited by existing techniques. 

SUMMARY OF THE INVENTION 

An audio signal processing system for processing acoustic waves from a plurality of 
sources, comprising a plurality of spaced apart transducer means for receiving acoustic 
waves from the plurality of sources, including echoes and reverberations thereof. The 
transducer means generates a plurality of acoustic signals in response thereto. Each of 
the plurality of transducer means receives acoustic waves from the plurality of sources 
including echoes and reverberations thereof, and generates one of the plurality of 
acoustic signals. A first processing means receives the plurality of acoustic signals and 
generates a plurality of first processed acoustic signals in response thereto. In the 
absence of echoes and reverberations of the acoustic waves from the plurality of sources, 
each of the first processed acoustic signals represent acoustic waves from only one 
different source. A 
second processing means receives the plurality of first processed acoustic signals and 
generates a plurality of second processed acoustic signals in response thereto. In the 
presence of echoes and reverberations of the acoustic waves from the plurality of sources, 
each of the second processed acoustic signals represent acoustic waves from only one 
different source. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a schematic block diagram of an embodiment of an acoustic signal processor of 
the present invention, using two microphones. 

FIG. 2 is a schematic block diagram of an embodiment of the direct-signal separator 
portion, i.e., the first stage of the processor shown in FIG. 1. 

FIG. 3 is a schematic block diagram of an embodiment of the crosstalk remover portion, 
i.e., the second stage of the processor shown in FIG. 1. 

FIG. 4 is an overview of the delay in the acoustic waves arriving at the direct signal 
separator portion of the signal processor of FIG. 1, and showing the separation of the 
[illegible] 

FIG. 5 is an overview of a portion of the crosstalk remover of the signal processor of 
FIG. 1 showing the removal of the crosstalk from one of the signal channels. 

FIG. 6 is a detailed schematic block diagram of an embodiment of a direct-signal separator 
using three microphones. 

FIG. 7 is a detailed schematic block diagram of an embodiment of a crosstalk remover 
suitable using three microphones. 

DETAILED DESCRIPTION OF THE INVENTION 

The present invention is a device that mimics the cocktail-party effect using a plurality 
of microphones with as many output audio channels, and a signal-processing module. When 
situated in a complicated acoustic environment that contains multiple audio sources with 
arbitrary spectral characteristics, it supplies output audio signals, each of which 
contains sound from at most one of the original sources. These separated audio signals can 
be used in a variety of applications, such as hearing aids or voice-activated devices. 

FIG. 1 is a schematic diagram of a signal separator processor of one embodiment of the 
present invention. As previously discussed, the signal separator processor of the present 
invention can be used with any number of microphones. In the embodiment shown in FIG. 1, 
the signal separator processor receives signals from a first microphone 10 and a second 
microphone 12, spaced apart by about two centimeters. As used herein, the microphones 10 
and 12 include transducers (not shown), their associated pre-amplifiers (not shown), and 
A/D converters 22 and 24 (shown in FIG. 2). 

The microphones 10 and 12 in the preferred embodiment are omnidirectional microphones, 
each of which is capable of receiving acoustic wave signals from the environment and for 
generating a first and a second acoustic electrical signal 14 and 16 respectively. The 
microphones 10 and 12 are either selected or calibrated to have matching sensitivity. The 
use of matched omnidirectional microphones 10 and 12, instead of directional or other 
microphones leads to simplicity in the direct-signal separator 30, described below. In 
[text missing in source] [omnidi]rectional microphones were used, with a separation of 2 
centimeters. The pair was mounted at least 25 centimeters from any large surface in order 
to preserve the omnidirectional nature of the microphones. Matching was achieved by 
connecting the two microphone outputs to a stereo microphone preamplifier and adjusting 
the individual channel gains so that the preamplifier outputs were closely matched. 

The preamplifier outputs were each digitally sampled at 22,050 samples per second, 
simultaneously. These sampled [illegible] to the direct signal separator [illegible] 

The direct-signal separator 30 employs information from a DOA estimator 20, which derives 
its estimate from the microphone signals. In a different embodiment of the invention, DOA 
information could come from a source other than microphone signals, such as direct input 
from [illegible] 

The direct signal separator 30 generates a plurality of output signals 40 and 42. The 
direct signal separator 30 generates as many output signals 40 and 42 as there are 
microphones 10 and 12, generating as many input signals 14 and 16 as are supplied to the 
direct signal separator 30. Assuming that there are two sources, A and B, generating 
acoustic wave signals in the environment in which the signal processor 8 is located, then 
each of the microphones 10 and 12 would detect acoustic waves from both sources. Hence, 
each of the electrical signals 14 and 16, generated by the microphones 10 and 12, 
respectively, contains components of sound from sources A and B. 

The direct-signal separator 30 processes the signals 14 and 16 to generate the signals 40 
and 42 respectively, such that in anechoic conditions (i.e., the absence of echoes and 
reverberations), each of the signals 40 and 42 would be of electrical signal 
representation of sound from only one source. In the absence of echoes and reverberations, 
the electrical signal 40 would be of sound only from source A, with electrical signal 42 
being of sound only from source B, or vice versa. Thus, under anechoic conditions the 
direct-signal separator 30 can bring about full separation of the sounds represented in 
signals 14 and 16. However, when echoes and reverberation are present, the separation is 
only partial. 

The output signals 40 and 42 of the direct signal separator 30 are supplied to the 
crosstalk remover 50. The crosstalk remover 50 removes the crosstalk between the signals 
40 and 42 to bring about fully separated signals 60 and 62 respectively. Thus, the 
direct-signal separator 30 and the crosstalk remover 50 play complementary roles in the 
system 8. The direct-signal separator 30 is able to bring about full separation of signals 
mixed in the absence of echoes and reverberation, but produces only partial separation 
when echoes and reverberation are present. The crosstalk remover 50 when used alone is 
often able to bring about full separation of sources mixed in the presence of echoes and 
reverberation, but is most effective when given inputs 40 and 42 that are partially 
separated. 

After some adaptation time, each output 60 and 62 of the crosstalk remover 50 contains the 
signal from only one sound source: A or B. Optionally, these outputs 60 and 62 can be 
connected individually to post filters 70 and 72, respectively, to remove known frequency 
coloration produced by the direct signal separator 30 or the crosstalk remover 50. 
Practitioners skilled in the art will recognize that there are many ways to remove this 
known frequency 

the preferred embodiment, two Knowles EM-3046 omnidi- coloration; these vary in terms of their cost and effective- 
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ness. An inexpensive post filtering method, for example. Schematically, the function of the direct signal separator 
consists of reducing the treble and boosting the base. 30 may be seen by referring to FIG. 4. Assuming that there 
The filters 70 and 72 generate output signals 80 and 82, are no echoes or reverberations, the acoustic wave signal 
respectively, which can be used in a variety of applications. received by the microphone 12 is the sum of source B and 
For example, they may be connected to a switch box and 5 a delayed copy of source A. (For clarity in presentation here 
then to a hearing aid. and in the forthcoming theory section, time relationship 
Referring to FIG. 2 there is shown one embodiment of the between the sources Aand B and the microphones 10 and 12 
direct signal separator 30 portion of the signal processor 8 of described as if the electrical signal 14 generated by the 
the present invention. The microphone transducers generate microphone 10 were simultaneous with source A and the 
input signals 11 and 13, which are sampled and digitized, by electrical signal 16 generated by the microphone 12 were 
clocked sample-and-hold circuits foUowed by analog-to- simultaneous with source B. This determines the two- 
digital converters 22 and 24, respectively, to produce arbitrary additive time constants that one is free to choose m 
sampled digital signals 14 and 16 respectively. each channel.) Thus, the electncal signal 16 generated by the 
rp. 1 . 1 1.1 • 1- J . c * J 1 1- -n I microphone 12 is an electrical representation of the sound 
The digital signal 14 IS supplied to a first delay hne 32. In « , , i j / * c - i i 
. r J u J- . ;u J 1 1- I-* J 1 4U 15 source B plus a delayed copy of source A, Smularly, the 
the preferred embodiment, the delay line 32 delays the i . ■ i • ^ ia . a\ *u • u m • 

, . - 1 1^ u ■ ♦ I 1 f *u electncal signal 14 generated by the microphone 10 is an 

digitally sampled signal 14 by a non-mtegral muhiple of the , . , ^ ^ ^? - . ^ . * j j i j 

sampling mtetval tT which was 45.35 microseco^s given electncalrep««entaUODof the sound sourcx A and a delayed 

the iampUng rale of 22.050 samples per second. The integral '=°Py °^ By delaying the electnca signal 14 

portion of the delay was implement^ using a digital delay "PP^opn^'f ^i^o"-"' ^^^^^ «gnal suppLed to the 

Une, while the remaining subsample delay of lest than one '° "''g'"'^ 3« would represent a delay«i 

, . , , . , * J • 1 * copy of source A plus a further delayed copy of source a. 

sample mterval was implemented using a non-causal, trun- ^'^^ . , , • , r » j f i- j 

. J . >ii • » o -c 11 * • 1 The subtraction of the signal from the delay line 32 and 

cated sine filter with 41 coeflScients. Specifically, to imple- j i - i ,j *i. - i 

, 1 J 1 r * • * . *i_ r 11 • digital signal 16 would remove the signal component rep- 

ment a subsample delay of t, given that t<T, the following j e j ai - i 

filter is used- resenting the delayed copy of sound source A, leaving only 

25 the pure soimd B (along with the further delayed copy of B). 
The amount of delay to be set for each of the digital delay 
lines 32 and 34 can be supplied from the DOA estimator 20. 
Numerous methods for estimating the relative time delays 
have been described in the prior art (for example, Schmidt, 

30 1981; Roy et al., 1988; Morita, 1991; AUen, 1991). Thus, the 

where x(n) is the signal to be delayed, y(n) is the delayed estimator 20 is weU known in the art. 

signal, and w(k) {k— 20, -19, . . . 19,20} are the 41 filter ^ different embodiment, omni-directional microphones 

coefficients. The filter coefficients are determined from the jq ^ ^^j^j ^ replaced by directional microphones 

subsample delay t as follows: ^^^^^^ ^^^y ^j^^e together. Then all delays would be 

w{k)'(ys) sinc{n[(f/r)-Jfc]} 35 replaced by multipliers; in particular, digital delay lines 32 

^^g^g and 34 would be replaced by multipliers. Each multiplier 

would receive the signal from its respective A/D converter 

siRcia) = sm(o)/a if a noi equal to 0 ^nd generate a scaled signal, which can be either positive or 

negative, in response. 

= I otherwise; 40 A preferred embodiment of the crosstalk remover 50 is 

shown in greater detail in FIG. 3. The crosstalk remover 50 
comprises a third combiner 56 for receiving the first output 

and S is a normalization factor given by signal 40 from the direct signal separator 30. The third 

combiner 56 also receives, at its negative input, the output 

^ 45 of a second adaptive filter 54. The output of the third 

s= Zj^^^^^iitfn-k]]. combiner 56 is supplied to a first adaptive filter 52. The 

output of the first adaptive filter 52 is supplied to the 
negative input of the fourth combiner 58, to which the 

The output of the first delay line 32 is supplied to the second output signal 42 from the direct signal separator 30 

negative input of a second combiner 38. The first digital so is also supplied. The outputs of the third and fourth com- 

signal 14 is also supplied to the positive input of a first biners 56 and 58 respectively, are the output signals 60 and 

combiner 36. Similarly, for the other channel, the second 62, respectively of the crosstalk remover 50. 

digital signal 16 is supplied to a second delay line 34, which Schematically, the function of the crosstalk remover 50 

generates a signal which is supplied to the negative input of may be seen by referring to FIG. 5. The inputs 40 and 42 to 

the first combiner 36. 55 the crosstalk remover 50 are the outputs of the direct-signal 

In the preferred embodiment, the sample-and-hold and separator 30. Let us assume that the direct-signal separator 

A/D operations were implemented by the audio input cir- 30 has become fully adapted, i.e., (a) that the electrical 

cuits of a Silicon Graphics Indy workstation, and the delay signal 40 represents the acoustic wave signals of source B 

fines and combiners were implemented in software running and its echoes and reverberation, plus echoes and reverbera- 

on the same machine. 60 tion of source A, and similarly (b) that the electrical signal 

However, other delay lines such as analog delay lines, 42 represents the acoustic wave signals of source A and its 

surface acoustic wave delays, digital low-pass filters, or echoes and reverberation, plus echoes and reverberation of 

digital delay lines with higher sampUng rates, may be used source B. Because the crosstalk remover 50 is a feedback 

in place of the digital delay line 32, and 34. Similarly, other network, it is easiest to analyze subject to the assumption 

combiners, such as analog voltage sub tractors using opera- 65 that adaptive filters 52 and 54 are fiilly adapted, so that the 

tional amplifiers, or special purpose digital hardware, may electrical signals 62 and 60 already correspond to colored 

be used in place of the combiners 36 and 38. versions of B and A, respectively. The processing of the 
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electrical signal 60 by the adaptive filter 52 will generate an yiW-^i(')-^2C'-T2)-H'i(/)-wi(f-(Ti+T2))+*u(')*>Vi(0+*i2WH'2(/) 
electrical signal equal to the echoes and reverberation of / x / . ^^ /^. /x. 

source B present in the eleclncal signal 42; hence subtrac- ^^2(0 ' * u/-n2vy 

tion of the output of adaptive filter 52 from the electrical 

signal 42 leaves output signal 62 with signal components 5 where the filters kii(t), kjjCt), kjiO), and k22(t) are related to 

only from source A. Similariy, the processing of the elec- kii'(0» ki2'(0» ^2i O)f ^22%^) t>y time shifts and linear 

trical signal 62 by the adaptive filter 54 will generate an combinations. Specifically, 
electrical signal equal to the echoes and reverberation of 

source A present in the electrical signal 40; hence subtrac- *u(')-*ii'(0-*2i'('-t2). 
tion of the output of adaptive filter 54 from the electrical 10 ^M'^iAO-^i'O-'^i)' 
signal 40 leaves output signal 60 with signal components 

only from source B. *2i(0"*2i'(0-*ii*('-'fi)» 
Theory 

It is assumed, solely for the purpose of designing the 

direct-signal separator 30, that the microphones 10 and 12 15 Note that yj(i) is contaminated by the term k^jO)* ^2(0* and 

are omnidirectional and matched in sensitivity. that y^O) is contaminated by the term k2i(l)*Wj(t). 

Under anechoic conditions, the signals x^{x) and XjO), Several possible forms of the crosstalk remover have been 

which correspond to the input signals, received by micro- described as part of the background of this invention, under 

phones 10 and 12, respectively, may be modeled as the heading of convolutive blind source separation. In the 

20 present embodiiment, the crosstalk remover forms discrete 

j:,(0-H',(/)+w2(f-x2) time sampled outputs 60 and 62 thus: 

X2(0-H'2(/)+H',(f-xJ, 

1000 

where Wj(t) and W2(t) are the original source signals, as they zi (n) = >i («) - ^ hi{k)z2 in - k) 

reach microphones 10 and 12, respectively, and Xj and are 25 

unknown relative time delays, each of which may be posi- 1000 

tive or negative . ^2 («) = >2 («) - ^ Ai {k)zi (n - k) 

Practitioners experienced in the art wilt recognize that 
bounded "negative" time delays can be achieved by adding 

a net time delay to the entire system. 30 ^^^^^ the discrete time filters h^ and h^ correspond to 

The relative tune delays x^ and x^ arc used to form outputs elements 52 and 54 in FIG. 3 and are estimated adaptively. 

yi(t) and y^W, which correspond to signals 40 and 42: xhe filters h^ and h^ are suictly causal, i.e., they operate only 

y {t)-x2{t~x^w {tyw (/-(t +T2)) °° P^^' samples of Zj and Zj. This structure was described 

^ ^ ^ 111 independently by Juttenetal. (1992) and by Piatt and Faggin 

yzCWO-JfiC'-tiWO-H-aa-CTj+Tj)) 35 (1992). 

The adaptation rule used for the filter coefficients in the 

As depicted m no. 2, these operations are accomplished by preferred embodiment is a variant of the LMS rule 

time-delay units 32 and 34, and combmers 36 and 38. ("Adaptive Signal Processing," Bernard Widrow and Sam- 

Under ^echoic conditions, these outputs 40 and 42, uel D. Steams, Prentice-Hall, Englewood CUflfe, N J., 1985. 

would be fuUy separated; i.e., each output 40 or 42 would 40 p 99) Th^ filter coefficients are updated at every time-step 
contain contributions firom one source alone. However under ^fter the new values of the outputs z, (n) and Z,(n) have 

echoic conditions these outputs 40 and 42 are not fuUy 5,,^ calculated. Specifically, using these new values of the 

separated. outputs, the filter coefficients are updated as follows : 

Under echoic and reverberant conditions, the microphone 
signals Xi(t) and XjO), which correspond to input signals 45 At(jfc)[ncw}-Ai(*)[old]+m r2(/i)2j(n-*)fc-i,2, .... 1000 

received by the microphones 10 and 12, respectively, may be ^ , ^ , , , , ^ 

modeled as /,2(t)[new>^(t)[o!d]+in ZiCrt)r2(n-*)A*u, . . . , 1000 

where m is a constant that determines the rate of adaptation 
x.(0-i(0-2(^-x2H*..'(0-H..(.M.2'(0*-2W gn^^ coefficients, e.g. 0.15 if the input signals 10 and 

^2(0-H'2(')+»*'i(f-x,H*2i'(0'**'i(0+*22'CO'w2(0. ^ norffialized to lie in the range -l^x(t)i+l. One 

skilled in the art will recognize that the filters ^ and h2 can 

where the symbol denotes the operation of convolution, be implemented in a variety of ways, including HRs and 

and the impulse responses kjj'(t), ki2'(t), ksj'O), and kj^XO lattice IIRs. 

incorporate the effects of echoes and reverberation. As described, the direct-signal separator 30 and crossUlk 

Specifically, kii'(t)*Wi(t) represents the echoes and rever- 55 remover 50 adaptively bring about full separation of two 

berations of source 1 (Wj(t)) as received at input 1 sound sources mixed in an echoic, reverberant acoustic, 

(microphone 10), ^^^^^{xY^Ji) represents the echoes and environment. However, the output signals z^{\) and Z2{t) 

reverberations of source 2 (w^O)) as received at input 1 may be unsatisfactory in practical applications because they 

(microphone 10 ), k2i'(t)*Wi(t) represents the echoes and are colored versions of the original sources Wi(t) and W2(t) 

reverberations of source 1 (w^(t)) as received at input 2 60 i.e., 

(microphone 12), and k22'(0*W2(0 represents the echoes and 

reverberations of source 2 (w2(t)) as received at input 2 

(microphone 12). ^i-ti(0'»*'i(0 
In consequence of the presence of echoes and ^ _v ^^y^ 

reverberation, the outputs 40 and 42 from the direct-signal 65 2 2 

separator 30 are not fully separated, but instead take the where ^^(t) and ^(t) represent the combined effects of the 

form echoes and reverberations and of the various known signal 
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transformations performed by the direct-signal separator 30 
and crosstalk remover 50. 

As an optional cosmetic improvement for certain commercial
applications, it may be desirable to append filters 70
and 72 to the network. The purpose of these filters is to undo
the effects of filters $\zeta_1(t)$ and $\zeta_2(t)$. As those familiar with the
art will realize, a large body of techniques for performing
this inversion to varying and predictable degrees of accuracy
currently exists.
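
As an illustration of the first stage described in the theory above, the following is a minimal Python sketch of the truncated sinc subsample delay and of the two-microphone direct-signal separator. It is not the code of the microfiche appendix. The function names (sinc, subsample_delay_taps, delay, direct_signal_separator), the zero-fill at the ends of each sequence, and the expression of delays in units of samples are assumptions made for this sketch only.

```python
import math

def sinc(a):
    """sinc(a) = sin(a)/a for a != 0, and 1 otherwise, as defined in the text."""
    return 1.0 if a == 0 else math.sin(a) / a

def subsample_delay_taps(frac, half_width=20):
    """Coefficients w(k), k = -half_width..half_width, for a delay of
    frac = t/T samples, normalized so that the coefficients sum to 1."""
    raw = [sinc(math.pi * (frac - k)) for k in range(-half_width, half_width + 1)]
    s = sum(raw)  # normalization factor S
    return [r / s for r in raw]

def delay(x, d, half_width=20):
    """Delay sequence x by d >= 0 samples, where d may be non-integral.
    The whole-sample part is an index shift; the remainder uses the
    non-causal sinc filter. Samples outside x are assumed to be zero."""
    whole = int(math.floor(d))
    taps = subsample_delay_taps(d - whole, half_width)
    y = []
    for n in range(len(x)):
        acc = 0.0
        for i, k in enumerate(range(-half_width, half_width + 1)):
            m = n - whole - k
            if 0 <= m < len(x):
                acc += taps[i] * x[m]
        y.append(acc)
    return y

def direct_signal_separator(x1, x2, tau1, tau2):
    """y1(n) = x1(n) - x2(n - tau2) and y2(n) = x2(n) - x1(n - tau1)."""
    d1 = delay(x1, tau1)
    d2 = delay(x2, tau2)
    y1 = [a - b for a, b in zip(x1, d2)]
    y2 = [a - b for a, b in zip(x2, d1)]
    return y1, y2
```

With whole-sample delays and an anechoic mixture built as in the model equations, the first output of this sketch reduces to $w_1(n) - w_1(n - (\tau_1 + \tau_2))$, with no contribution from the second source.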

The embodiment of the signal processor 8 has been 
described in FIGS. 1-5 as being useful with two micro- 
phones 10 and 12 for separating two sound sources, A and 
B. Clearly, the invention is not so limited. The forthcoming 
section describes how more than two microphones and 
sound sources can be accommodated.
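
The two-channel crosstalk remover and its LMS-variant weight update, described in the theory above, can be sketched as follows. This is a hedged illustration rather than the appendix code: the function name crosstalk_remover, the use of Python lists, and the zero initial filter coefficients are assumptions, and the defaults n_taps=1000 and m=0.15 simply echo the example values in the text (with m assuming inputs normalized to the range -1 to +1).

```python
def crosstalk_remover(y1, y2, n_taps=1000, m=0.15):
    """Two-channel feedback crosstalk remover (sketch).

    Outputs:
        z1(n) = y1(n) - sum_{k=1..n_taps} h1(k) * z2(n - k)
        z2(n) = y2(n) - sum_{k=1..n_taps} h2(k) * z1(n - k)
    After each new output pair, the coefficients are updated:
        h1(k) += m * z1(n) * z2(n - k)
        h2(k) += m * z2(n) * z1(n - k)
    """
    # Index 0 is unused because the filters are strictly causal.
    h1 = [0.0] * (n_taps + 1)
    h2 = [0.0] * (n_taps + 1)
    z1, z2 = [], []
    for n in range(len(y1)):
        kmax = min(n, n_taps)  # only past outputs that actually exist
        a = y1[n] - sum(h1[k] * z2[n - k] for k in range(1, kmax + 1))
        b = y2[n] - sum(h2[k] * z1[n - k] for k in range(1, kmax + 1))
        z1.append(a)
        z2.append(b)
        for k in range(1, kmax + 1):
            h1[k] += m * a * z2[n - k]
            h2[k] += m * b * z1[n - k]
    return z1, z2, h1, h2
```

Setting m to zero freezes the filters at their initial values, in which case the outputs equal the inputs; this is a convenient sanity check before adaptation is enabled.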
General Case with M Microphones and M Sources 

The invention is able to separate an arbitrary number M 
of simultaneous sources, as long as they are statistically 
independent, if there are at least M microphones. 

Let $w_j(t)$ be the j'th source signal and $x_i(t)$ be the i'th
microphone (mic) signal. Let $t_{ij}$ be the time required for
sound to propagate from source j to mic i, and let $d(t_{ij})$ be the
impulse response of a filter that delays a signal by $t_{ij}$.
Mathematically, $d(t_{ij})$ is the unit impulse delayed by $t_{ij}$, that
is

$$d(t_{ij}) = \delta(t - t_{ij})$$

where $\delta(t)$ is the unit impulse function ("Circuits, Signals
and Systems", by William McC. Siebert. The MIT Press,
McGraw Hill Book Company. 1986, p. 319).

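In the absence of echoes and reverberation, the i'th mic
signal $x_i(t)$ can be expressed as a sum of the appropriately
delayed source signals

$$x_i(t) = \sum_{j=1}^{M} d(t_{ij}) * w_j(t)$$

Matrix representation allows a compact representation of
this equation for all M mic signals:

$$X(t) = D(t) * W(t)$$

where X(t) is an M-element column vector whose i'th
element is the i'th mic signal $x_i(t)$, D(t) is an MxM element
square matrix whose ij'th element (i.e., the element in the i'th
row and j'th column) is $d(t_{ij})$, and W(t) is an M-element
column vector whose j'th element is the j'th source signal
$w_j(t)$. Specifically,

$$X(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_M(t) \end{bmatrix}, \quad
D(t) = \begin{bmatrix} d(t_{11}) & d(t_{12}) & \cdots & d(t_{1M}) \\ d(t_{21}) & d(t_{22}) & \cdots & d(t_{2M}) \\ \vdots & \vdots & \ddots & \vdots \\ d(t_{M1}) & d(t_{M2}) & \cdots & d(t_{MM}) \end{bmatrix}, \quad
W(t) = \begin{bmatrix} w_1(t) \\ w_2(t) \\ \vdots \\ w_M(t) \end{bmatrix}$$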

For each source $w_j(t)$, if the delays $t_{ij}$ for i=1,2, . . . , M to
the M mics are known (up to an arbitrary constant additive
factor that can be different for each source), then M signals
$y_j(t)$, j=1,2, . . . , M, that each contain energy from a single
but different source $w_j(t)$, can be constructed from the mic
signals $x_i(t)$ as follows:

$$Y(t) = \operatorname{adj}D(t) * X(t)$$

where

$$Y(t) = \begin{bmatrix} y_1(t) \\ y_2(t) \\ \vdots \\ y_M(t) \end{bmatrix}$$

is the M-element column vector whose j'th element is the
separated signal $y_j(t)$, and adjD(t) is the adjugate matrix of
the matrix D(t). The adjugate matrix of a square matrix is the
matrix obtained by replacing each element of the original
matrix by its cofactor, and then transposing the result
("Linear Systems", by Thomas Kailath, Prentice Hall, Inc.,
1980, p. 649). The product of the adjugate matrix and the
original matrix is a diagonal matrix, with each element along
the diagonal being equal to the determinant of the original
matrix. Thus,



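$$\begin{aligned}
Y(t) &= \operatorname{adj}D(t) * X(t) \\
&= \operatorname{adj}D(t) * D(t) * W(t) \\
&= \begin{bmatrix} |D(t)| & 0 & \cdots & 0 \\ 0 & |D(t)| & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & |D(t)| \end{bmatrix} * W(t) \\
&= |D(t)| * W(t)
\end{aligned}$$

where $|D(t)|$ is the determinant of D(t). Thus,

$$y_j(t) = |D(t)| * w_j(t) \quad \text{for } j = 1, 2, \ldots, M$$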

$y_j(t)$ is a "colored" or filtered version of $w_j(t)$ because of the
convolution by the filter impulse response $|D(t)|$. If desired,
this coloration can be undone by post filtering the outputs by
a filter that is the inverse of $|D(t)|$. Under certain
circumstances, determined by the highest frequency of interest
in the source signals and the separation between the mics,
the filter $|D(t)|$ may have zeroes at certain frequencies; these
make it impossible to exactly realize the inverse of the filter
$|D(t)|$. Under these circumstances any one of the numerous
techniques available for approximating filter inverses (see,
for example, "Digital Filters and Signal Processing", by
Leland B. Jackson, Kluwer Academic Publishers, 1986,
p. 146) may be used to derive an approximate filter with
which to do the post filtering.
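
The adjugate construction described above can be checked symbolically with a short Python sketch. It assumes, purely for illustration, that every delay is a whole number of samples, so that a filter made of delayed unit impulses can be stored as a dictionary mapping delay to gain; convolving two delayed impulses then simply adds their delays. The helper names (conv, add, minor, det, adjugate, matmul) and the example delays are hypothetical and are not taken from the patent.

```python
from itertools import product

def conv(a, b):
    """Convolve two sums of delayed impulses, each stored as {delay: gain}."""
    out = {}
    for (da, ga), (db, gb) in product(a.items(), b.items()):
        out[da + db] = out.get(da + db, 0) + ga * gb
    return {d: g for d, g in out.items() if g != 0}

def add(a, b, sign=1):
    """Return a + sign * b, dropping terms whose gain cancels to zero."""
    out = dict(a)
    for d, g in b.items():
        out[d] = out.get(d, 0) + sign * g
    return {d: g for d, g in out.items() if g != 0}

def minor(mat, i, j):
    """Matrix with row i and column j removed."""
    return [[e for c, e in enumerate(row) if c != j]
            for r, row in enumerate(mat) if r != i]

def det(mat):
    """Determinant by cofactor expansion along the first row."""
    if len(mat) == 1:
        return mat[0][0]
    total = {}
    for j, e in enumerate(mat[0]):
        total = add(total, conv(e, det(minor(mat, 0, j))), 1 if j % 2 == 0 else -1)
    return total

def adjugate(mat):
    """Transpose of the cofactor matrix (for matrices of size 2 or more)."""
    n = len(mat)
    return [[{d: g * (1 if (i + j) % 2 == 0 else -1)
              for d, g in det(minor(mat, j, i)).items()}
             for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Matrix product in which element products are convolutions."""
    n = len(a)
    out = [[{} for _ in range(n)] for _ in range(n)]
    for i, j, k in product(range(n), repeat=3):
        out[i][j] = add(out[i][j], conv(a[i][k], b[k][j]))
    return out

# Example: 3 mics, 3 sources; t[i][j] is the delay in samples from source j to mic i.
t = [[0, 2, 5], [3, 0, 1], [4, 2, 0]]
D = [[{t[i][j]: 1} for j in range(3)] for i in range(3)]
P = matmul(adjugate(D), D)  # diagonal entries equal det(D), others are empty
```

In this sketch an empty dictionary stands for the zero filter, so the off-diagonal entries of P being empty corresponds to complete removal of the other sources under anechoic conditions.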
The delays $t_{ij}$ can be estimated from the statistical properties of the mic signals, up to a constant additive factor that can be different for each source. This is the subject of a co-pending patent application by the same inventors, filed even date herewith. Alternatively, if the position of each microphone and each source is known, then the delays $t_{ij}$ can be calculated exactly. For any source that is distant, i.e., many times farther than the greatest separation between the mics, only the direction of the source is needed to calculate its delays to each mic, up to an arbitrary additive constant.

The first stage of the processor 8, namely the direct signal separator 30, uses the estimated delays to construct the adjugate matrix adjD(t), which it applies to the microphone signals X(t) to obtain the outputs Y(t) of the first stage, given by

$$Y(t) = \operatorname{adj}D(t) * X(t)$$

In the absence of echoes and reverberations, each output $y_j(t)$ contains energy from a single but different source $w_j(t)$.

When echoes and reverberation are present, each mic receives the direct signals from the sources as well as echoes and reverberations from each source. Thus

$$x_i(t) = \sum_{j=1}^{M} d(t_{ij}) * w_j(t) + \sum_{j=1}^{M} e_{ij}(t) * w_j(t) \quad \text{for } i = 1, 2, \ldots, M$$

where $e_{ij}(t)$ is the impulse response of the echo and reverberation path from the j'th source to the i'th mic. All M of these equations can be represented in compact matrix notation by

$$X(t) = D(t) * W(t) + E(t) * W(t)$$

where E(t) is the MxM matrix whose ij'th element is the filter $e_{ij}(t)$.

If the mic signals are now convolved with the adjugate matrix of D(t), instead of obtaining separated signals we obtain partially separated signals:

$$\begin{aligned}
Y(t) &= \operatorname{adj}D(t) * X(t) \\
&= |D(t)| * W(t) + \operatorname{adj}D(t) * E(t) * W(t)
\end{aligned}$$

Notice that each $y_j(t)$ contains a colored direct signal from a single source, as in the case with no echoes, and differently colored components from the echoes and reverberations of every source, including the direct one.

The echoes and reverberations of the other sources are removed by the second stage of the network, namely the crosstalk remover 50, which generates each output as follows:

$$z_j(t) = y_j(t) - \sum_{k \neq j} h_{jk}(t) * z_k(t) \quad \text{for } j = 1, 2, \ldots, M$$

where the entities $h_{jk}(t)$ are causal adaptive filters. (The term "causal" means that $h_{jk}(t) = 0$ for $t \le 0$.) In matrix form these equations are written as

$$Z(t) = Y(t) - H(t) * Z(t)$$

where Z(t) is the M-element column vector whose j'th element is $z_j(t)$, and H(t) is an MxM element matrix whose diagonal elements are zero and whose off diagonal elements are the causal, adaptive filters $h_{jk}(t)$.

These filters are adapted according to a rule that is similar to the Least Mean Square update rule of adaptive filter theory ("Adaptive Signal Processing," Bernard Widrow and Samuel D. Stearns, Prentice-Hall, Englewood Cliffs, N.J., 1985, p. 99).

This is most easily illustrated in the case of a discrete time system.

Illustrative Weight Update Methodology for Use with a Discrete Time Representation

First, we replace the time parameter t by a discrete time index n. Second, we use the notation H(n)[new] to indicate the value of H(n) in effect just before computing new outputs at time n. At each time step n, the outputs Z(n) are computed according to

$$Z(n) = Y(n) - H(n)[\text{new}] * Z(n)$$

Note that the convolution on the right hand side involves only past values of Z, i.e., Z(n-1), Z(n-2), . . . , Z(n-N), because the filters that are the elements of H are causal. (N is defined to be the order of the filters in H.)

Now new values are computed for the coefficients of the filters that are the elements of H. These will be used at the next time step. Specifically, for each j and each k, with j≠k, perform the following:

$$h_{jk}(u)[\text{old}] = h_{jk}(u)[\text{new}], \quad u = 1, 2, \ldots, N$$

$$h_{jk}(u)[\text{new}] = h_{jk}(u)[\text{old}] + m\, z_j(n)\, z_k(n - u), \quad u = 1, 2, \ldots, N$$

The easiest way to understand the operation of the second stage is to observe that the off-diagonal elements of H(t) have zero net change per unit time when the products like $z_j(t)\,z_k(t-u)$ are zero on average. Because the sources in W are taken to be statistically independent of each other, those products are zero on average when each output $z_j(t)$ has become a colored version of a different source, say $w_j(t)$. (The correspondence between sources and outputs might be permuted so that the numbering of the sources does not match the numbering of the outputs.)

More specifically, let $Z(t) = \Psi(t) * W(t)$. From the preceding paragraph, equilibrium is achieved when $\Psi(t)$ is diagonal. In addition, it is required that:

$$\begin{aligned}
\Psi(t) * W(t) &= Y(t) - H(t) * \Psi(t) * W(t) \\
&= |D(t)| * W(t) + \operatorname{adj}D(t) * E(t) * W(t) - H(t) * \Psi(t) * W(t) \\
&= \left(|D(t)|\, I + \operatorname{adj}D(t) * E(t) - H(t) * \Psi(t)\right) * W(t)
\end{aligned}$$

so that

$$\Psi(t) = |D(t)|\, I + \operatorname{adj}D(t) * E(t) - H(t) * \Psi(t)$$

This relation determines the coloration produced by the two stages of the system, taken together.

An optional third stage can use any one of numerous techniques available to reduce the amount of coloration on any individual output.

Example of General Case with M=3, i.e. with 3 Mics and 3 Sources

In the case where there are 3 mics and 3 sources, the general matrix equation

$$X = D * W$$

becomes

$$\begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{bmatrix} =
\begin{bmatrix} d(t_{11}) & d(t_{12}) & d(t_{13}) \\ d(t_{21}) & d(t_{22}) & d(t_{23}) \\ d(t_{31}) & d(t_{32}) & d(t_{33}) \end{bmatrix} *
\begin{bmatrix} w_1(t) \\ w_2(t) \\ w_3(t) \end{bmatrix}$$

If the delays $t_{ij}$ are known, then the adjugate matrix of D(t) is given by

$$\operatorname{adj}D = \begin{bmatrix}
d(t_{22}+t_{33}) - d(t_{23}+t_{32}) & d(t_{13}+t_{32}) - d(t_{12}+t_{33}) & d(t_{12}+t_{23}) - d(t_{22}+t_{13}) \\
d(t_{23}+t_{31}) - d(t_{21}+t_{33}) & d(t_{11}+t_{33}) - d(t_{13}+t_{31}) & d(t_{21}+t_{13}) - d(t_{11}+t_{23}) \\
d(t_{21}+t_{32}) - d(t_{22}+t_{31}) & d(t_{12}+t_{31}) - d(t_{11}+t_{32}) & d(t_{11}+t_{22}) - d(t_{21}+t_{12})
\end{bmatrix}$$

Note that adding a constant delay to the delays associated with any column of D(t) leaves the adjugate matrix unchanged. This is why the delays from a source to the three mics need only be estimated up to an arbitrary additive constant.

The output of the first stage, namely the direct signal separator 30, is formed by convolving the mic signals with the adjugate matrix.

$$\begin{bmatrix} y_1(t) \\ y_2(t) \\ y_3(t) \end{bmatrix} = \operatorname{adj}D * \begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{bmatrix}$$

The network that accomplishes this is shown in FIG. 6.

In the absence of echoes and reverberations, the outputs of the first stage are the individual sources, each colored by the determinant of the delay matrix.

$$\begin{bmatrix} y_1(t) \\ y_2(t) \\ y_3(t) \end{bmatrix} = \operatorname{adj}D * D(t) * \begin{bmatrix} w_1(t) \\ w_2(t) \\ w_3(t) \end{bmatrix} = |D(t)| * \begin{bmatrix} w_1(t) \\ w_2(t) \\ w_3(t) \end{bmatrix}$$

In the general case when echoes and reverberation are present, each output of the first stage also contains echoes and reverberations from each source. The second stage, namely the crosstalk remover 50, consisting of a feedback network of adaptive filters, removes the effects of these unwanted echoes and reverberations to produce outputs that each contain energy from only one different source, respectively. The matrix equation of the second stage

$$Z = Y - H * Z$$

becomes

$$\begin{bmatrix} z_1(t) \\ z_2(t) \\ z_3(t) \end{bmatrix} = \begin{bmatrix} y_1(t) \\ y_2(t) \\ y_3(t) \end{bmatrix} - \begin{bmatrix} 0 & h_{12}(t) & h_{13}(t) \\ h_{21}(t) & 0 & h_{23}(t) \\ h_{31}(t) & h_{32}(t) & 0 \end{bmatrix} * \begin{bmatrix} z_1(t) \\ z_2(t) \\ z_3(t) \end{bmatrix}$$

where each $h_{ij}$ is a causal adaptive filter. The network that accomplishes this is shown in FIG. 7.

Conclusion

It should be noted that the number of microphones and associated channels of signal processing need not be as large as the total number of sources present, as long as the number of sources emitting a significant amount of sound at any given instant in time does not exceed the number of microphones. For example, if during one interval of time only sources A and B emit sound, and in another interval of time only sources B and C emit sound, then during the first interval the output channels will correspond to A and B respectively, and during the second interval the output channels will correspond to B and C respectively.


As previously discussed, in the preferred embodiment of
the present invention, the invention has been implemented in
software, as set forth in the microfiche appendix. The
software code is written in the C++ language for execution
on a workstation from Silicon Graphics. However, as
previously discussed, hardware implementation of the present
invention is also contemplated to be within the scope of the
present invention. Thus, for example, the direct signal
separator 30 and the crosstalk remover 50 can be a part of
a digital signal processor, or can be a part of a general
purpose computer, or can be a part of analog signal
processing circuitry. In addition, the present invention is not
limited to the processing of acoustic waves. It can be used
to process any signals having a delay, where the problem is
to separate the signals from the sources.

There are many advantages and differences of the present 
invention from the prior art. 

1. Although the present invention is similar in objective to
sound separation based on auditory scene analysis, it
differs from such approaches in principle and in technique.

2. In contrast with approaches based on auditory scene
analysis, this invention separates sounds by exploiting the
separation of their sources in space and their statistical
independence.

3. The present invention differs from directional
microphones in that few presuppositions are made with regard
to the locations of sound sources relative to the device.
The present device need not be pointed at or mounted
close to a source of interest. The necessary selectivity is
brought about by processing of signals captured by a
microphone array, i.e., a collection of microphones.
Moreover, the selectivity attained is much greater: a
directional microphone cannot completely suppress any
source of sound.

4. While the prior art nonlinear techniques can be very
computationally efficient and are of scientific interest as
models of human cocktail-party processing, they are of
less practical or commercial significance than the present
invention because of their inherent inability to bring about
full suppression of unwanted sources. This inability
originates from the incorrect assumption that at every instant
in time, at least one microphone contains only the desired
signal. The present invention differs from these nonlinear
techniques in that linear operations (addition, subtraction,
and filtering) are employed to cancel unwanted sources.

5. The present invention is an example of "active
cancellation." In contrast with classical beamforming, which
aligns copies of a desired source in time and adds them
together, active cancellation matches copies of undesired
sources and subtracts them to bring about cancellation.
("Matching" the copies of the undesired sources generally
entails more than simply re-aligning them in time; usually,
it involves re-shaping them by filtering.) Bearing this
simplified explanation in mind, it may be seen that the
degree of selectivity achieved by active cancellation is
determined by factors rather different from those important
in classical beamforming. In active cancellation, the
degree of selectivity is determined by the exactness with
which identical copies of unwanted sources can be created
from different microphones for subsequent cancellation.
In contrast with a classical beamformer, a sound-separation
device that employs active cancellation can in
principle remove one undesired source completely using
just two microphones.

6. The present invention also does not need a reference
sound.

7. In contrast with active-cancellation algorithms that
require a reference signal, the present invention operates
"blindly": it accommodates the more difficult case in
which all of the signals directly available for processing,
i.e., the microphone signals, are assumed to be mixtures
of sources. The present invention is a method for bringing
about blind separation of sources, given the direction of
arrival (DOA) of each source. This DOA information may
be obtained in a variety of ways, for example by direct
specification from a human user or by statistical
estimation from the microphone signals. Precisely how DOA
information is obtained is immaterial in the context of the
present invention; what is important is how DOA
information is used to bring about source separation.

8. The present invention differs from gain-based active
cancellation in that it requires no assumption of
simultaneity of signal reception at all of the microphones.

9. In contrast with purely delay-based active cancellation
and variants that introduce simple gains in addition to
delays (e.g., Platt & Faggin, 1992), the present invention
is based on an acoustic model that includes the effects of
echoes and reverberation.

10. Single-stage techniques for blind, convolutive active
cancellation are usually unable to separate mixtures that
are not already partially separated.

11. Two prior art techniques for blind, convolutive active
cancellation are, like the present invention, based on a
two-stage architecture. Of these, the technique of Najar et
al. (1994) differs from the present invention in that each
output channel of its first stage is a filtered version of only
one input channel. Therefore, the first stage of the system
described by Najar et al. (1994) cannot bring about full
separation even in the absence of echoes and
reverberation, unless the original sources have no
overlapping spectral components.

12. The other prior art technique based on a two-stage 
architecture is the Griffiths-Jim beamformer (Griffiths and 
Jim, 1982). The Griffiths-Jim beamformer employs active 
cancellation in its second stage that requires a reference 
signal. The necessary reference noise signal is produced 
by the first stage, using known DOA information. If this 
reference noise signal contains a significant amount of the 
desired signal, then the second stage will erroneously 
enhance the noise and suppress the desired signal (Van 
Compernolle, 1989). In the present invention, the second 
stage is blind; it employs its own outputs as reference 
signals. Unlike the Griffiths-Jim reference signal, these 
become progressively purer with time as the network 
adapts. 
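
The idea of a second stage that uses its own outputs as reference signals can be sketched as a cross-coupled feedback network. This is a simplified illustration, not the patented method: the cross-coupling filters are reduced to single one-sample-delayed taps `a` and `b`, and they are assumed to be known in advance, whereas the invention adapts its filters over time. All names are hypothetical.

```python
def mix(s1, s2, a, b):
    """Assumed toy mixture: each channel carries its own source plus a
    one-sample-delayed, scaled leak of the other source."""
    x1, x2 = [], []
    prev1 = prev2 = 0.0
    for u, v in zip(s1, s2):
        x1.append(u + a * prev2)
        x2.append(v + b * prev1)
        prev1, prev2 = u, v
    return x1, x2


def feedback_separate(x1, x2, a, b):
    """Cross-coupled feedback stage: each output is its input minus a
    filtered copy of the OTHER output (here a single delayed tap)."""
    y1, y2 = [], []
    prev1 = prev2 = 0.0
    for u, v in zip(x1, x2):
        out1 = u - a * prev2  # reference signal is the other output
        out2 = v - b * prev1
        y1.append(out1)
        y2.append(out2)
        prev1, prev2 = out1, out2
    return y1, y2


s1 = [1.0, 2.0, -1.0, 4.0]
s2 = [3.0, -2.0, 0.0, 1.0]
x1, x2 = mix(s1, s2, a=0.5, b=0.25)
y1, y2 = feedback_separate(x1, x2, a=0.5, b=0.25)
```

With matching taps the outputs reproduce the two sources. In an adaptive version the taps would be adjusted over time, so the purer the outputs become, the better they serve as references.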

What is claimed is: 
1. A signal processing system for processing waves from 
a plurality of sources, said system comprising: 

a plurality of transducer means for receiving waves from 
said plurality of sources, including echoes and rever- 
beration thereof and for generating a plurality of signals 
in response thereto, wherein each of said plurality of 
transducer means receives waves from said plurality of 
sources including echoes and reverberations thereof, 
and for generating one of said plurality of signals; 

first processing means for receiving said plurality of 
signals and for generating a plurality of first processed 
signals in response thereto, said first processing means 
comprises: 

a plurality of delay means, each for receiving one of 
said plurality of signals and for generating a delayed 
signal in response thereto, and 

a plurality of first combining means, each for receiving 
at least one of said plurality of signals and one of said 
delayed signals not associated with said one of said 
plurality of signals, and for combining said received 
delayed signal and said signal, by an active cancel- 
lation process, to produce one of said first processed 
signals; and 

second processing means for receiving said plurality of 
first processed signals and for generating a plurality of 
second processed signals in response thereto, wherein 
each of said second processed signals represents waves 
from one different source, said second processing 
means including feedback means for supplying said 
plurality of second processed signals to said second 
processing means for combining each of said plurality 
of second processed signals with at least one of said 
plurality of first processed signals not associated with 
said each second processed signal to generate said 
plurality of second processed signals. 

2. A signal processing system for processing waves from 
a plurality of sources, said system comprising: 

a plurality of transducer means for receiving waves from 
said plurality of sources, including echoes and rever- 
beration thereof and for generating a plurality of signals 
in response thereto, wherein each of said plurality of 
transducer means receives waves from said plurality of 
sources including echoes and reverberations thereof, 
and for generating one of said plurality of signals; 

first processing means for receiving said plurality of 
signals and for generating a plurality of first processed 
signals in response thereto, said first processing means 
comprises: 

a plurality of multiplying means, each for receiving 
different ones of said plurality of signals and for 
generating a scaled signal in response thereto, and 

a plurality of first combining means, each for receiving 
at least one of said plurality of signals and one scaled 
signal not associated with said one of said plurality 
of signals, and for combining said received scaled 
signal and said signal to produce one of said first 
processed signals; and 

second processing means for receiving said plurality of 
first processed signals and for generating a plurality of 
second processed signals in response thereto, wherein 
each of said second processed signals represents waves 
from one different source, said second processing means 
including feedback means for supplying said plurality 
of second processed signals to said second processing 
means for combining each of said plurality of second 
processed signals with at least one of said plurality of 
first processed signals not associated with said each 
second processed signal to generate said plurality of 
second processed signals. 

3. The system of claim 1, further comprising: 

means for generating a direction of arrival signal for said 
waves; and 

wherein said first processing means generates said plu- 
rality of first processed signals, in response to said 
direction of arrival signal. 

4. The system of claim 1, wherein the number of trans- 
ducer means is two, the number of first processed signals is 
two, and the number of second processed signals is two. 

5. The system of claim 1, wherein said transducer means 
are spaced apart omnidirectional microphones. 

6. The system of claim 1 wherein said microphones are 
co-located directional microphones. 

7. The system of claim 1, 3, 4, 5, 6, or 2 wherein said 
second processing means comprises: 

a plurality of second combining means, each of said 
second combining means having a first input, at least 
one other input, and an output; each of said second 
combining means for receiving one of said first pro- 
cessed signals at said first input, an input signal at said 
other input, and for generating an output signal, at said 
output; said output signal being one of said plurality of 
second processed signals and is a difference between 
said first processed signal received at said first input 
and the sum of said input signal received at said other 
input; 

a plurality of adaptive filter means for generating a 
plurality of adaptive signals, each of said adaptive filter 
means for receiving said output signal from one of said 
plurality of second combining means and for generat- 
ing an adaptive signal in response thereto; and 

means for supplying each of said plurality of adaptive 
signals to one of said other input of said plurality of 
second combining means other than the associated one 
of said second combining means. 

8. The system of claim 7 further comprising means for 
filtering each of said second processed signals to generate a 
plurality of third processed signals. 

9. The system of claim 8 wherein said second processed 
signals are characterized by having a low frequency com- 
ponent and a high frequency component, and wherein said 
filtering means boosts the low frequency component relative 
to the high frequency component of said second processed 
signals. 

10. A signal processing system for processing waves from 
a plurality of sources, said system comprising: 

a plurality of transducer means for receiving waves from 
said plurality of sources, including echoes and rever- 
berations thereof and for generating a plurality of 
signals in response thereto, wherein each of said plu- 
rality of transducer means receives waves from said 
plurality of sources including echoes and reverbera- 
tions thereof, and for generating one of said plurality of 
signals; 

first processing means for receiving said plurality of 
signals and for generating a plurality of first processed 
signals in response thereto, wherein in the absence of 
echoes and reverberations of said waves from said 
plurality of sources, each of said first processed signals 
represents waves from only one different source; said 
first processing means comprising: 

a plurality of delay means, each for receiving one of 
said plurality of signals and for generating a delayed 
signal in response thereto, and 

a plurality of first combining means, for receiving said 
plurality of signals and for feedforward combining 
said plurality of signals in an active cancellation 
process to produce said plurality of processed 
signals, wherein each of said plurality of first com- 
bining means receives at least one of said plurality of 
signals and one of said delayed signals not associated 
with said one of said plurality of signals, and for 
combining said received delayed signal and said one 
signal to produce one of said first processed signals; 
and 

second processing means for receiving said plurality of 
first processed signals and for generating a plurality of 
second processed signals in response thereto, wherein 
in the presence of echoes and reverberations of said 
waves from said plurality of sources, each of said 
second processed signals represents waves from one 
different source; said second processing means includ- 
ing feedback means for supplying said plurality of 
second processed signals to said second processing 
means for combining each of said plurality of second 
processed signals with at least one of said plurality of 
first processed signals not associated with said each 
second processed signal to generate said plurality of 
second processed signals. 

11. A signal processing system for processing waves from 
a plurality of sources, said system comprising: 

a plurality of transducer means for receiving waves from 
said plurality of sources, including echoes and rever- 
berations thereof and for generating a plurality of 
signals in response thereto, wherein each of said plu- 
rality of transducer means receives waves from said 
plurality of sources including echoes and reverbera- 
tions thereof, and for generating one of said plurality of 
signals; 

first processing means for receiving said plurality of 
signals and for generating a plurality of first processed 
signals in response thereto, wherein in the absence of 
echoes and reverberations of said waves from said 
plurality of sources, each of said first processed signals 
represents waves from only one different source; said 
first processing means comprising: 
a plurality of first combining means, for receiving said 
plurality of signals and for feedforward combining 
said plurality of signals in an active cancellation 
process to produce said plurality of processed 
signals, 

a plurality of multiplying means, each for receiving 
different ones of said plurality of signals and for 
generating a scaled signal in response thereto; and 
wherein each of said plurality of first combining means 
receives at least one scaled signal and one of said 
plurality of signals not associated with said one 
scaled signal, and for combining said received scaled 
signal and said signal to produce one of said first 
processed signals; 
second processing means for receiving said plurality of 
first processed signals and for generating a plurality of 
second processed signals in response thereto, wherein 
in the presence of echoes and reverberations of said 
waves from said plurality of sources, each of said 
second processed signals represents waves from one 
different source; said second processing means includ- 
ing feedback means for supplying said plurality of 
second processed signals to said second processing 
means for combining each of said plurality of second 
processed signals with at least one of said plurality of 
first processed signals not associated with said each 
second processed signal to generate said plurality of 
second processed signals. 

12. The system of claim 10 wherein said waves are 
acoustic waves, and said transducer means are microphones. 

13. The system of claim 12 further comprising means for 
filtering each of said second processed signals to generate a 
plurality of third processed signals. 

14. The system of claim 13 wherein said second processed 
signals are characterized by having a low frequency com- 
ponent and a high frequency component, and wherein said 
filtering means boosts the low frequency component relative 
to the high frequency component of said second processed 
signals. 

15. The system of claim 10, wherein the number of 
transducer means is two, the number of first processed 
signals is two, and the number of second processed signals 
is two. 

16. The system of claim 10, wherein said transducer 
means are spaced apart omnidirectional microphones. 

17. The system of claim 10 wherein said microphones are 
co-located directional microphones. 

18. The system of claim 10, 12, 13, 14, 15, 16, 17 or 11 
wherein said second processing means comprises: 

a plurality of second combining means, each of said 
second combining means having a first input, at least 
one other input, and an output; each of said second 
combining means for receiving one of said first pro- 
cessed signals at said first input, an input signal at said 
other input, and for generating an output signal, at said 
output; said output signal being one of said plurality of 
second processed signals and is a difference between 
said first processed signal received at said first input 
and the sum of said input signal received at said other 
input; 

a plurality of adaptive filter means for generating a 
plurality of adaptive signals, each of said adaptive filter 
means for receiving said output signal from one of said 
plurality of second combining means and for generat- 
ing an adaptive signal in response thereto; and 

means for supplying each of said plurality of adaptive 
signals to one of said other input of said plurality of 
second combining means other than the associated one 
of said second combining means. 

19. The system of claim 18 wherein each of said adaptive 
filter means comprises a tapped delay line. 

20. A method of processing waves from a plurality of 
sources, comprising: 

receiving said waves, including echoes and reverberations 
thereof, by a plurality of transducer means; 

converting said waves, including echoes and reverbera- 
tions thereof from said plurality of sources, by each of 
said plurality of transducer means into an electrical 
signal, thereby generating a plurality of electrical sig- 
nals; 

first processing said plurality of electrical signals, by an 
active cancellation process, to generate a plurality of 
first processed signals, wherein in the absence of echoes 
and reverberations of said waves from said plurality 
of sources, each of said first processed signals repre- 
sents waves from one source, and a reduced amount of 
waves from other sources, said first processing step 
including: 

delaying each one of said plurality of electrical signals 
and generating a delayed signal in response thereto, 
and 

combining each one of said plurality of electrical 
signals with one of said delayed signals not associated 
with said one of said plurality of signals to 
generate one of said first processed signals; and then 



secondly processing said plurality of first processed sig- 
nals to generate a plurality of second processed signals, 
including combining each of said plurality of second 
processed signals with at least one of said plurality of 
first processed signals not associated with said each 
second processed signal to generate said plurality of 
second processed signals, wherein in the presence of 
echoes and reverberations of said waves from said 
plurality of sources, each of said second processed 
signals represents waves from only one different 
source. 

21. The method of claim 20 further comprising the step of: 
filtering each of said second processed signals to generate 
a plurality of third processed signals. 

22. The method of claim 20 further comprising the step of: 
sampling and converting each one of said plurality of 
electrical signals and for supplying same to said plu- 
rality of delay means and to said plurality of combining 
means, as said electrical signal. 

23. The method of claim 20 wherein said second process- 
ing step further comprises: 

subtracting, by a plurality of subtracting means, a differ- 
ent one of said first processed signals by an adaptive 
signal and generating an output signal, thereby gener- 
ating a plurality of output signals; 

adaptively filtering said output signals to generate a 
plurality of adaptive signals; and 

supplying each one of said plurality of adaptive signals to 
a different one of said subtracting means. 

24. A signal processing system for processing waves from 
a plurality of sources, said system comprising: 

at least a first and second transducer for receiving waves 
from said plurality of sources, including echoes and 
reverberation thereof and for generating at least a first 
and a second signal in response thereto, wherein each 
of said transducers receives waves from said plurality 
of sources including echoes and reverberations thereof, 
and for generating one of said first and second signals; 

first processing means for receiving said first and second 
signals and for generating a first and a second pro- 
cessed signals in response thereto, said first processing 
means comprises: 

first delay means for receiving said first signal and for 
generating a first delayed signal in response thereto, 

second delay means for receiving said second signal 
and for generating a second delayed signal in 
response thereto, 

first combining means for receiving said first signal and 
said second delayed signal, and for combining said 
received first signal and said second delay signal, by 
an active cancellation process, to produce said first 
processed signal, and 

second combining means for receiving said second 
signal and said first delayed signal, and for combin- 
ing said received second signal and said first delayed 
signal, by an active cancellation process, to produce 
said second processed signal; and 
second processing means for receiving said first and 
second processed signals and for generating a third and 
a fourth processed signals in response thereto, said 
second processing means comprises: 

third combining means for receiving the first processed 
signal to produce the third processed signal in 
response thereto; 

fourth combining means for receiving the second pro- 
cessed signal to produce the fourth processed signal 
in response thereto; 

first adaptive filter means for receiving said third pro- 
cessed signal, for generating a first adaptive signal in 
response thereto, and for supplying said first adap- 
tive signal to said fourth combining means; 

second adaptive filter means for receiving said fourth 
processed signal, for generating a second adaptive 
signal in response thereto, and for supplying said 
second adaptive signal to said third combining 
means; 

wherein the third combining means combines the first 
processed signal and the second adaptive signal to 
produce the third processed signal so that the third 
processed signal is a difference between the first pro- 
cessed signal and the second adaptive signal; and 

wherein the fourth combining means combines the second 
processed signal and the first adaptive signal to produce 
the fourth processed signal so that the fourth processed 
signal is a difference between the second processed 
signal and the first adaptive signal. 

ABSTRACT 

An apparatus for determining an angle of incidence of a 
received signal utilizes two widely spaced interferome- 
ters with equal length crossing baselines. The time dif- 
ferential of a signal arrival between the two elements of 
each interferometer is established and the difference 
between the two time differentials is determined. Multi- 
plying this difference by a predetermined constant pro- 
vides the desired angle. 

9 Claims, 9 Drawing Figures 

ANGLE TRACKING SYSTEM 

BACKGROUND OF THE INVENTION 

1. Field of the Invention 
The invention pertains to angle of signal arrival measurements 
and more specifically to the measurement of 
the vertical angle of the signal arriving from a source 
not necessarily in the horizontal plane of the receiver. 

2. Description of the Prior Art 
Interferometric methods for measuring arrival angles 
of received signals have long been in use in the sonar 
and radar art. In these systems, the time difference of 
[illegible] correlation techniques, if the 
signals are continuous, or by timing techniques, if the 
signals are pulses. This difference in time of arrival and 
the separation distance is utilized to determine the angle 
of arrival of the signal with an accuracy that is a function 
of the receiver separation, improving as the separation 
increases. The measured angle is in the plane of 
signal propagation defined by the two receivers and the 
source. In many applications, however, the horizontal 
plane source angle, i.e., azimuth or bearing, is required 
rather than the angle in the signal propagation plane 
provided by the interferometer. A conversion from the 
signal propagation angle to the horizontal plane angle 
may readily be realized with knowledge of the vertical 
angle of arrival of the signal. Vertical angle of arrival is 
also useful for other applications, such as establishing 
the relative altitude or depth of the signal source. Prior 
art methods of measuring vertical angle are of limited 
accuracy and resolution capability. 

Vertical angle measurements have been made in the 
prior art with two beams having peaks offset at equal 
and opposite angles from a reference angle to establish 
equal amplitude responses for signals incident from the 
reference direction. In a sonar system, an acoustic signal 
arriving from the reference angle direction induces 
electrical signals in the beam transducers of equal magnitude 
[illegible] difference therebetween 
[illegible] from angular directions 
[illegible] function of the angle off the reference 
[illegible] which is determined by whether the arrival angle is 
less than or greater than the reference angle. The accuracy 
of these systems is generally poor, being a function 
of the relative beam shapes of the transducers and the 
interpolation. 

Greater accuracy than that obtainable with amplitude 
interpolation systems may be achieved with the utilization 
of a vertically split array to effectively establish 
two transducer arrays with a physical separation therebetween. 
These systems determine the time difference 
of arrival of a signal incident to the dual array. This time 
difference is a function of the angle from the perpendicular 
to the array surface and does not depend on the 
array beam shape, being only a function of the angle of 
arrival and the dual array separation. Array separation, 
however, is generally small requiring that the difference 
between two nearly equal times of arrival be determined, 
thus limiting the accuracy of the system. Additionally, 
the small separation between the dual arrays 
causes the noise at the output terminals of each array to 
be correlated, adversely affecting the signal-to-noise 
ratio of the system and concomitantly the angle determination 
accuracy. Further, the dual arrays overlap in 
the azimuthal plane and thereby cannot resolve targets 
within the azimuthal beam width of the arrays. This 
limitation of bearing resolution causes the dual array to 
respond to the centroid of multiple targets within the 
azimuthal beam width. 

SUMMARY OF THE INVENTION 

An angle measuring system constructed in accordance 
with the principles of the present invention includes 
two widely separated split array pairs, each pair 
comprising upper and lower arrays with parallelly 
scanned receiving beams. The upper array of one pair 
and the lower array of the other are coupled to form 
two wide based interferometer pairs. In each pair the 
[illegible] differential between the arrays is 
[illegible] 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a pictorial representation of dual arrays 
mounted on a ship with ray lines to the target indicated. 

FIGS. 2A and 2B depict the ray line geometry between 
a target of interest and the dual split arrays. 

FIG. 3 is a tabulation of formulas useful for explaining 
the invention. 

FIGS. 4a and 4b comprise a block diagram 
of an embodiment of the invention. 

FIG. 5 is a geometrical representation of ray paths 
within an array beamwidth that is useful in the determination 
of the length of the delay line employed in the 
invention. 

FIG. 6 depicts the ray line geometry of a three receiver 
implementation of the invention. 

FIG. 7 is a block diagram of another embodiment of 
the invention. 

DESCRIPTION OF THE PREFERRED 
EMBODIMENTS 

In FIG. 1, sonar arrays 11, 12 are shown mounted on 
a ship along a wide baseline 13 which is coincident with 
[illegible] 
arrays while the depression angle E_D is 
[illegible] from the horizontal 16. This depression angle 
is determined from the separation 17 between the 
[illegible] centers of the split array and the differential time 
[illegible] therebetween of a signal emitted from the 
[illegible] separations of the phase centers of 
the split arrays A, B and C, D is short, the received 
[illegible] adjacent split arrays is correlated, as for example 
[illegible] received by the split arrays A, B. This 
correlation of received noise adversely affects the depression 
angle measurement accuracy. Additionally, 
[illegible] the signals received by the four arrays A, D, C, B 
from the target O are processed as in the prior art, 
signals from targets at O and O' that are within an array 
beamwidth cannot be resolved and angular measurements 
corresponding to the target signal centroid result. 
These deficiencies of the prior art are remedied by determining 
the differential time of arrival of received 
signals at subarrays A, D and subarrays C, B, as will be 
described subsequently. The long baseline established 
between the correlating subarrays significantly reduces 
the noise signal correlation at the subarrays and significantly 
improves the target resolution capability of the 
system. 

Refer now to FIG. 2A wherein previously referenced 
elements are given the previously assigned reference 
numerals and wherein line RQ is coincident with a 
vertical reference axis. Consider the triangle OPQ, 
where Q is the midpoint between the phase centers of 
the subarrays A, C and is h/2 vertically distant from the 
midpoint of the overall array P, h being the vertical 
subarray phase center separation 17. The horizontal 
subarray phase center separation being l. The application 
of the law of cosines to the triangle OPQ yields the 
equation 1 of FIG. 3 for the cosine of the angle ψ between 
the vertical PQ and the range ray OP. Similarly, 
the application of the law of cosines to the triangle 
OPR, where R is the midpoint between the subarrays B, 
D yields the expression for cos ψ given by equation 2. 
Equation 3 is an expression for cos ψ in terms of the 
distances OR, OQ, and OP that is obtained by subtracting 
equation 2 from equation 1. Expressions for distances 
OQ and OR shown in equations 4 and 5 may be 
obtained by applying the law of cosines to triangles 
OAC, OAQ and OBD, OBR, respectively. Substituting 
equations 4 and 5 into equation 3 yields the expression 
for cos ψ given in equation 6. Expressions for the bracketed 
terms in the numerator of equation 6 may be obtained 
by applying the law of cosines to the triangles in 
the planes OAD and OBC. These expressions are 
shown in equations 7a and 7b. Equation 7c states that 
the diagonal distance AD is equal to the diagonal distance 
BC. This is a consequence of the positioning of 
the phase centers of the subarrays at the corners of a 
rectangle. The substitutions of the equations 7 into 
equation 6 yields an expression for cos ψ in terms of the 
angles of incidence to the diagonally positioned interferometers 
A, D and C, B as shown in equation 8. 

Refer now to FIG. 2B wherein ray 21 and ray 22 are 
respectively shown incident to subarray A and subarray 
D. Rays 21, 22 correspond to the rays OA and OD of 
FIG. 2A, respectively, when the position O is at a distance 
from the array that is very much greater than the 
length of the baseline AD. Under these conditions, the 
rays 21, 22 are substantially parallel and the path length 
difference to the subarrays A, D of signals emitted from 
a target at position O may be determined by dropping a 
perpendicular from the subarray A to intercept ray 22 at 
a point 23. The distance between point 23 and subarray 
D is therefore the differential path length to subarrays 
A, D of a signal emitted from a target at position O. 
Since the rays 21, 22 are substantially parallel, it follows 
that the ray OP of FIG. 2A is substantially parallel to 
these rays and the angle between AD and D (23) is 
equal to the angle θ2 shown in FIG. 2A. Similarly, the 
differential ray paths of a signal emitted by a target at 
position O to the phase centers of subarrays B, C may be 
determined from the baseline BC and the angle θ1. 
From the above it is apparent that cos θ1 and cos θ2 are 
given by equations 9a, 9b, wherein Vi is the propagation 
velocity of the wavefront 33 and t1 and t2 are the differential 
times of arrival. Consequently, the difference 
between the differential time delays of the interferometers 
AD, BC determines the depression angle ψ from 
the vertical as given in equation 10. Since the depression 
angle from the horizontal E_D is equal to the depression 
angle from the vertical less 90 degrees as shown in 
equation 11, it follows that the depression angle from the 
horizontal E_D is given by equation 12. 
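
The far-field geometry described above (parallel rays, with the path difference found by dropping a perpendicular onto the second ray) implies that the path difference equals the baseline length times the cosine of the incidence angle. A minimal sketch of that single relation follows. Equations 9a through 12 of FIG. 3 are not reproduced in this text, so the formula cos θ = V·t / L is an assumption drawn from the described geometry, and the function and parameter names are hypothetical.

```python
import math


def incidence_angle_deg(time_diff, velocity, baseline):
    """Angle between the baseline and the arriving ray, in degrees.

    Assumes a far-field source, so the path difference velocity * time_diff
    equals baseline * cos(angle).
    """
    cosine = velocity * time_diff / baseline
    # Clamp to guard against rounding or measurement noise pushing the
    # ratio slightly outside the valid range of acos.
    cosine = max(-1.0, min(1.0, cosine))
    return math.degrees(math.acos(cosine))


# Illustrative numbers only: 1500 m/s sound speed, 10 m baseline.
broadside = incidence_angle_deg(0.0, 1500.0, 10.0)         # zero time difference
endfire = incidence_angle_deg(10.0 / 1500.0, 1500.0, 10.0)  # maximum time difference
oblique = incidence_angle_deg(5.0 / 1500.0, 1500.0, 10.0)   # half the maximum
```

A zero time difference corresponds to a ray perpendicular to the baseline, and the largest possible time difference to a ray along it.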



Referring now to FIG. 4, a first array of receiving 
elements 25 may be coupled to beam formers 26, 27 to 
form upper and lower receiving beams and a second 
array of elements 28 may be coupled to beam formers 
29, 30 to form a second set of upper and lower receiving 
beams. A phase front 33 incident to arrays 25, 28 in- 
duces a signal at output port 34 of beam former 26, 
output port 35 of beam former 27, output port 36 of 
beam former 29, and output port 37 of beam former 30. 
Each of the output ports couple to signals that arrive at 
angles within a beam centered at an angle θ1 from the 
baseline between the phase centers of an upper and 
lower beam combination, as for example the phase cen- 
ter of the array of elements coupled to upper beam 
former 26 and the phase center of the array of elements 
coupled to lower beam former 30, as represented in 
FIG. 5. Signals at the output port 35 of lower beam 
former 27 are coupled through a delay line 41 to the 
input terminal of a tapped delay line 42, while signals at 
output terminal 36 of upper beam former 29 are coupled 
via line 43 to the input terminal of a tapped delay line 
44. Each tap on the tapped delay line 44 is an incremen- 
tal fine time delay from the coarse time delay of delay 
line 41. The input terminals to delay lines 42, 44 are 
oppositely positioned to provide differential time delays 
as will be explained subsequently. Corresponding out- 
put terminals of delay lines 42, 44 are coupled to corre- 
lators 45 wherein signals at corresponding taps are cor- 
related and wherefrom a multiplicity of correlation 
signals is coupled to a processor 46. 
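
The arrangement above, in which correlators at corresponding taps feed a processor that looks for the strongest correlation, can be sketched in discrete time as a bank of correlations at different relative lags followed by a peak search. This is a loose software analogy under stated assumptions, not the patented circuit: whole-sample lags stand in for tap positions, and the function names are hypothetical.

```python
def correlate_at_lag(a, b, lag):
    """Sum of a[n] * b[n + lag] over the samples where both are defined.

    Each lag plays the role of one pair of corresponding taps.
    """
    total = 0.0
    for n in range(len(a)):
        m = n + lag
        if 0 <= m < len(b):
            total += a[n] * b[m]
    return total


def peak_lag(a, b, max_lag):
    """Return the lag in [-max_lag, max_lag] with the largest correlation,
    mimicking the processor that picks the tap with the peak output."""
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: correlate_at_lag(a, b, lag))


# The second signal is the first one delayed by two samples.
first = [0.0, 0.0, 1.0, 3.0, -2.0, 1.0, 0.0, 0.0, 0.0, 0.0]
second = [0.0, 0.0, 0.0, 0.0, 1.0, 3.0, -2.0, 1.0, 0.0, 0.0]

estimated_delay = peak_lag(first, second, max_lag=3)
```

The lag at which the peak appears gives the time difference between the two received signals, which is the quantity the angle computation needs.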

Delay line 41 provides a delay Id that is given by 
equation 13 in FIG. 3, L being the length of the baseline. 
When a signal arrives at an angle θi corresponding to
the beam peak, the delayed output signal from terminal 
35 and the undelayed output signal from terminal 36 
arrive at the input terminals of delay lines 42, 44 at 
substantially the same time and are thereby in phase at 
the central taps of the delay lines 42, 44, thus establish- 
ing a correlation output signal from the correlator cou- 
pled to the central tap that exceeds all output signals 
from the correlators coupled to the other taps of the 
delay lines. If the signal is incident at an angle other 
than that corresponding to the beam peak, the signals 
arrive at the input terminals at delay lines 42, 44 with a 
differential time lag therebetween. This differential time 
lag causes the peak correlation signal to appear at a tap 
on either side of the central tap. The tap at which this 
peak appears is a function of the incident angle to the 
arrays 25, 28. If the beam width at the output terminals
35, 36 is θB as shown in FIG. 5 and the length S of the
tap delay lines is chosen such that a signal arriving at an 
angle θi - θB/2 establishes a maximum correlation signal
at the correlator 47 coupled to the last tap of the delay 
line 42 and the first tap of delay line 44, S is then given 
by equation 14 of FIG. 3 wherein Vi is the propagation
velocity along the tapped delay lines. For an angle θ'
within the beam the maximum correlations will occur at 
a distance from the input end of tap delay line 43 that is 
given by equation 15. The output signals from the cor- 
relators 45 are coupled to processor 46, which may 
contain a network of comparators and logic circuits, to 
determine the taps that give rise to the maximum corre- 
lation signal, thereby determining the time differential 
of the received signals at the phase centers of the inter- 
ferometer formed by the upper beam former 29 and the 
lower beam former 27. Similarly, the time differentials
of received signals at the phase centers of the upper
beam former 26 and the lower beam former 30 are determined
by the delay line 49, tap delay lines 51, 52 and
correlators 53. Time differentials so determined are
coupled to subtraction network 54 wherefrom the
difference in the time differentials, given by equation 16,
is coupled to a computer 55 for the determination of a
depression angle in accordance with equation 12. Computer
55 contains a multiplier 56 to which signals representative
of Vi/2h and (t1 - t2) are coupled for multiplication
to provide a signal at an output terminal 57 that
is representative of the depression angle Ed.
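
The tap-pair correlation and peak search described above can be sketched in discrete time. This is a minimal illustration and not the circuit of FIG. 4: sampled signals stand in for the analog delay lines, all function names are hypothetical, and the final product of Vi/2h and (t1 - t2) is taken only from the description of multiplier 56, since equation 12 of FIG. 3 is not reproduced in this text.

```python
import random


def tap_correlations(x, y, num_taps):
    """Correlate corresponding taps of two oppositely fed tapped delay lines.

    Tap k delays x by k samples and y by (num_taps - 1 - k) samples,
    mimicking input terminals placed at opposite ends of the two lines.
    """
    out = []
    for k in range(num_taps):
        dx = k
        dy = num_taps - 1 - k
        start = max(dx, dy)
        out.append(sum(x[n - dx] * y[n - dy] for n in range(start, len(x))))
    return out


def peak_tap_to_lag(correlations):
    """Return the delay of y relative to x, in samples, from the peak tap."""
    num_taps = len(correlations)
    peak = max(range(num_taps), key=lambda k: correlations[k])
    return 2 * peak - (num_taps - 1)


def depression_angle(t1, t2, velocity, h):
    """Multiply (velocity / 2h) by (t1 - t2), as multiplier 56 is described."""
    return velocity / (2.0 * h) * (t1 - t2)


# Hypothetical test signal: white noise, with y a copy of x delayed by 4 samples.
random.seed(1)
x = [random.gauss(0.0, 1.0) for _ in range(400)]
y = [0.0] * 4 + x[:-4]
lag = peak_tap_to_lag(tap_correlations(x, y, num_taps=11))
```

In this sketch an odd number of taps places the peak at the central tap when the two inputs arrive together, and each tap step changes the differential delay by two samples because the two lines are fed from opposite ends.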

The signals coupled to each of the correlators 45, 53 
emanate from the same source but arrive at the correla- 
tors with time delay differentials that are functions of 
the angles of incidence and the time delays of the sys- 
tem. Thus, each correlator performs an autocorrelation 
in accordance with the well known formula 17 of FIG. 
3. Autocorrelation function equation 17 provides a peak 
value when the differential time delay τ is zero. Thus,
the correlator that provides the maximum correlation
signal is that for which the signals coupled thereto arrive
in phase. It should be recognized by those skilled in
the art that the correlators at which the correlation 
signals are maximum are equally and oppositely dis- 
placed from the central correlator in correlators 45, 53. 
For each target within a subarray beam, an additional 
pair of peak correlations are induced in the correlators 
45, 53. Since the peaks of each pair are equally and 
oppositely spaced from the central correlator, each pair 
may be identified and appropriately processed as described
above. In this manner, multiple targets within a
subarray beam, but at varying depression angles may be 
resolved and tracked. 

It should be recognized that the above-described
angle determination may also be accomplished by positioning
a receiver at point P, the crossover of the two
base lines AD, CB, to form a triangle with receivers at
points C and D, as shown in FIG. 6. The time differences
of arrival t1' and t2' measured between the receiver
at P and the receivers at C and D are one-half the
time differences of arrival t1 and t2, respectively. With
the substitution t1=2t1' and t2=2t2' the equations of
FIG. 3 may be utilized to determine the vertical angle
ψ. The measurement of the time of arrival may be accomplished
in the manner previously described.

Referring to FIG. 7, wherein elements previously 
discussed bear the prior assigned reference numerals, 
the output signals of the receiver at C may be coupled 
to the tapped delay line 44 while the output signals of 
the receiver at D may be coupled via coarse delay line
49 to tapped delay line 52. Output signals of the receiver
at P are coupled from node 58 directly to tapped delay
line 51 and via coarse delay line 41 to tapped delay line 
42. 

While the invention has been described in its preferred
embodiments, it is to be understood that the
words which have been used are words of description
rather than limitation and that changes may be made
within the purview of the appended claims without
departing from the true scope and spirit of the invention
in its broader aspects. 

I claim: 

1. A passive apparatus for measuring an angle to a 
signal emitter comprising: 

first and second means positioned with a predetermined
separation distance along a first axis therebetween
for receiving signals emitted from said emitter
signal;



third and fourth means positioned with said predeter- 
mined separation distance along said first axis for 
receiving said emitted signals, located a preselected 
distance from said first and second means along a 
second axis and relatively positioned along said
first axis such that said first and third means and 
said second and fourth means are correspondingly 
positioned along said first axis; 

time difference means coupled to said four receiving
means for providing a signal representative of a
first differential time t1 between a signal arrival at
said first receiving means and said signal arrival at
said fourth receiving means and a signal representative
of a second differential time t2 between said
signal arrival at said second receiving means and
said third receiving means; and

means for providing a signal representative of a difference
between times t1 and t2, said difference
between t1 and t2 being representative of said angle
to said signal emitter.

2. A passive angle measuring apparatus in accordance 
with claim 1 wherein said time difference means in- 
cludes: 

delay means coupled to receive signals from said 
second and fourth receiving means for providing 
time delays substantially equal to time differentials 
of arrival at predetermined angles of incidence 
between said first and fourth and said third and 
second receiving means; 

fine delay means coupled to said delay means to receive
delayed signals from said second and fourth
receiving means and to receive signals directly
from said first and third receiving means for providing
a multiplicity of fine time delays to said
delayed signals and to said directly received signals;
and

means coupled to said fine delay means to receive
said delayed signals and said directly received signals
in pairs, after corresponding fine time delays,
for correlating said fine time delayed pairs of delayed
signals and directly received signals.

3. A passive receiving apparatus for angular measure- 
ments in accordance with claim 2, wherein said fine 
delay means includes: 

first tapped delay line means for providing said fine 
time delays having input means at ends thereof 
coupled to receive signals directly from said first
and third receiving means; and 

second tapped delay means for providing said fine 
time delays having input means at ends thereof 
opposite said receiving ends of said first tapped 
delay means coupled to receive signals from said 
second and fourth receiving means delayed 
through said delay means; 

said first and second tapped delay means having taps 
correspondingly positioned to form tap pairs, each 
pair of taps coupled to said correlating means. 

4. A method of measuring an angle to a signal emitter, 
which comprises: 

receiving a signal from said signal emitter at first,
second, third, and fourth receiving means, positioned
with phase centers at corners of a predetermined
rectangle, said first and fourth receiving
means and said second and third receiving means 
forming diagonally positioned pairs; 

delaying signals received by said second and fourth 
receiving means for a time duration determined by 
said rectangular diagonal length and a signal angle
of incidence corresponding to a peak of a receiving
beam; 

coupling said delayed signals from said second receiving
means and signals from said third receiving
means to first and second fine delay means, respectively,
said first and second fine delay means having
a multiplicity of corresponding output terminal
pairs, each pair representative of an angle within 
said receiving beam; 

coupling said delayed signals from said fourth receiv- 
ing means and signals from said first receiving 
means to third and fourth fine delay means, respectively,
said third and fourth fine delay means having
a multiplicity of corresponding output terminal
pairs, each pair representative of an angle within 
said receiving beam; 

correlating signals at said terminal pairs of said first
and second fine delay means and said third and
fourth fine delay means;

determining output terminal pairs at which peak correlation
signals occur; and

establishing angle of signal incidence from said termi- 
nal pair peak correlation signal determination. 

5. The method of claim 4, wherein the step of establishing
said angle of signal incidence includes:

determining the differential time of arrival of said
incident signal between said first and fourth receiving
means and between said second and third receiving
means;

subtracting the differential time of arrival between
said first and fourth receiving means from said
differential time of arrival between said second and
third receiving means; and

multiplying said difference between said differential 
time delays by a predetermined constant factor. 

6. A passive apparatus for measuring an angle to a 
signal emitter comprising: 

first and second means positioned on a first axis with 
a predetermined separation distance therebetween 
for receiving signals emitted from said signal emitter;

third means located a preselected distance from said 
first and second means along a second axis for 
receiving signals emitted from said signal emitter; 

delay means coupled to receive signals from said first
and third receiving means for providing time de- 
lays substantially equal to time differentials of ar- 
rival at predetermined angles of incidence between 
said first and third and said second and third receiv- 
ing means; 

fine delay means coupled to said delay means to re- 
ceive delayed signals from said first and third re- 
ceiving means and to receive signals directly from 
said second and third receiving means for provid- 
ing a multiplicity of time delays to said delayed 
signals and to said directly received signals, said 
time delays and said fine time delays providing a 
first differential time t1 between a signal arrival at
said first receiving means and a signal arrival at said
third receiving means, and a second differential
time t2 between a signal arrival at said second re- 
ceiving means and a signal arrival at said third 
receiving means; 

means coupled to said fine delay means to receive 
said delayed signals and coupled to receive said 
directly received signals, after corresponding fine
time delays, for correlating said fine time delayed
signals and directly received signals; and

means for providing a signal representative of difference
between times t1 and t2 that is representative
of said angle to said signal emitter.

7. A passive receiving apparatus for angular measurements
in accordance with claim 6, wherein said fine
delay means includes:

first tapped delay line means for providing said fine 
time delays having input means at ends thereof 
coupled to receive signals directly from said sec- 
ond and third receiving means; and 

second tapped delay means for providing said fine 
time delays having input means at ends thereof 
opposite said receiving ends of said first tapped 
delay means coupled to receive signals from said 
first and third receiving means delayed through 
said delay means; 

said first and second tapped delay means having taps 
correspondingly positioned to form tap pairs, each 
pair of taps coupled to said correlating means. 

8. A method of measuring an angle to a signal emitter, 
which comprises: 

receiving a signal from said signal emitter at first,
second and third receiving means, positioned with
phase centers at corners of a predetermined isosceles
triangle with a base through said phase centers
of said first and second receiving means;

delaying signals received by said first and third receiving
means for a time duration determined by
said isosceles triangle side length and a signal angle
of incidence corresponding to a peak of a beam
incident to said receiving means;

coupling said delayed signals from said first and third 
receiving means to first and second fine delay 
means, respectively, said first and second fine delay 
means having a multiplicity of output terminals; 

coupling said signals from said second and third receiving
means directly to third and fourth fine
delay means, respectively, said third and fourth
fine delay means having a multiplicity of output 
terminals respectively corresponding to output 
terminals of said first and second fine delay means 
to form terminal pairs, each pair representative of 
an angle within said beam incident to said receiving 
means; 

correlating signals at said terminal pairs of said first
and third fine delay means and said second and
fourth fine delay means;

determining output terminal pairs at which peak correlation
signals occur; and

establishing angle of signal incidence from said termi- 
nal pair peak correlation signal determination. 

9. The method of claim 8, wherein the step of estab- 
lishing said angle of signal incidence includes: 

determining the differential time of arrival of said 
incident signal between said first and third receiv- 
ing means and between said second and third re- 
ceiving means; 

subtracting the differential time of arrival between 
said first and third receiving means from said dif- 
ferential time of arrival between said second and 
third receiving means; and 

multiplying said difference between said differential 
time delays by a predetermined constant factor.

[57] ABSTRACT

Methods and systems for beamforming are disclosed that
include a signal processor that can dynamically determine
the relative time delays between a plurality of frequency-dependent
signals. The signal processor can adaptively
generate a beam signal by aligning the plural frequency-dependent
signals according to the relative time delays
between the signals. The signal processor can store one
frequency-dependent signal as a reference signal and can
align the remaining frequency-dependent signals relative to
this reference signal. One advantage of the signal processor
is that it can align the plural frequency-dependent signals
generated by an array of microphones that can be arranged
in a linear, two dimensional or three dimensional array and
located in a room environment.

METHODS AND APPARATUS FOR
ADAPTIVE BEAMFORMING

This invention was made with government support under 
Grant/Contract No. MIP-9120843 awarded by the National
Science Foundation. The government has certain rights in 
this invention. 

FIELD OF THE INVENTION 

The present invention relates to methods and apparatus 
for adaptive signal processing and, more particularly, to 
methods and apparatus for adaptively combining a plurality 
of signals, e.g., electrically represented audio signals, to 
form a beam signal. 

BACKGROUND OF THE INVENTION 

Many communication systems, such as radar systems, 
sonar systems and microphone arrays, use beamforming to 
enhance the reception of signals. In contrast to conventional 
communication systems that do not discriminate between 
signals based on the position of the signal source, beam- 
forming systems are characterized by the capability of 
enhancing the reception of signals generated from sources at 
specific locations relative to the system.

Generally, beamforming systems include an array of 
spatially distributed sensor elements, such as antennas, sonar
phones or microphones, and a data processing system for
combining signals detected by the array. The data processor
combines the signals to enhance the reception of signals 
from sources located at select locations relative to the sensor 
elements. Essentially, the data processor "aims" the sensor 
array in the direction of the signal source. For example, a 
linear microphone array uses two or more microphones to 
pick up the voice of a talker. Because one microphone is 
closer to the talker than the other microphone, there is a 
slight time delay between the two microphones. The data 
processor adds a time delay to the nearest microphone to 
coordinate these two microphones. By compensating for this
time delay, the beamforming system enhances the reception 
of signals from the direction of the talker, and essentially 
aims the microphones at the talker. 
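
The two-microphone example above can be sketched as a delay-and-sum operation on sampled signals. This is an illustrative sketch only: the function names, the whole-sample delay, the 3-sample lag, and the sine-tone "voice" are assumptions chosen for the example and do not come from the patent text.

```python
import math


def delay(signal, samples):
    """Delay a sampled signal by a whole number of samples (zero-padded)."""
    samples = max(0, min(samples, len(signal)))
    return [0.0] * samples + list(signal[:len(signal) - samples])


def delay_and_sum(near_mic, far_mic, lag_samples):
    """Delay the nearer microphone by lag_samples, then average both channels."""
    aligned_near = delay(near_mic, lag_samples)
    return [0.5 * (a + b) for a, b in zip(aligned_near, far_mic)]


def energy(signal):
    """Sum of squared samples, used here as a simple measure of output level."""
    return sum(v * v for v in signal)


# Hypothetical scene: the talker's tone reaches the far microphone 3 samples late.
near = [math.sin(0.3 * n) for n in range(200)]
far = delay(near, 3)
aimed = delay_and_sum(near, far, 3)      # delay matches the true lag
unaimed = delay_and_sum(near, far, 0)    # no compensation applied
```

When the applied delay matches the true lag, the two channels add coherently and the output energy is higher than in the uncompensated case, which is the sense in which the array is "aimed" at the talker.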

A major factor in the effectiveness of these beamforming
systems is the accuracy of the time delays necessary for
aiming the sensor array. One known technique for determining
the time delays necessary for aiming the sensor array
employs a priori knowledge of the source position, the
source orientation and the radiation pattern of the signal.
Essentially, the data processor determines, from the position
of the source and from the position of the sensor elements,
a delay factor for each of the sensor elements. The data
processor then applies such delay factors to the respective
sensor elements to aim the sensor array in the direction of
the signal source.

Although these systems work well if the position of the
signal source is precisely known, the effectiveness of these
systems drops off dramatically with slight errors in the
estimated a priori information. For instance, in some systems
with source-location schemes, it has been shown that
the data processor must know the location of the source
within a few centimeters to enhance the reception of signals.
Therefore, these systems require precise knowledge of the
position of the source, and precise knowledge of the position
of the sensors. As a consequence, these systems require both
that the sensor elements in the array have a known and static
spatial distribution and that the signal source remains stationary
relative to the sensor array. Furthermore, these
beamforming systems require a first step for determining the
talker position and a second step for aiming the sensor array
based on the expected position of the talker.

Other techniques for determining the direction for aiming 
the sensor array rely on a priori information regarding the 
signal waveform and the signal radiation pattern. For 
example, radar systems use beamforming to transmit signals 
in a select direction. If an object is present in that direction, 
the signal reflects off the object and travels back toward the
radar system. Therefore, the radar system is transmitting and
receiving very similar signals. Furthermore, the data processor
assumes that the objects are sufficiently distant from
the sensor array that the incoming signals have a particular
radiation pattern. The assumed radiation pattern can be a
particularly simple pattern that reduces the complexity of the
time delay computation. 

The radar system capitalizes on the similarity of the
transmitted and received signals by using signals that have
features which facilitate signal processing. The data processor
can directly compare the features of the received signal
against the features of the transmitted signal and determine 
differences between the two signals that relate to the relative 
time delays between each sensor. Furthermore, the radar 
system can use the assumptions regarding the radiation 
pattern of the incoming signals to simplify the signal pro- 
cessing techniques necessary to calculate the time delays. 
The data processor then compensates for the respective time 
delays between each sensor element to aim the sensor array 
in the direction of the object.

Although these systems work well if the signal waveform
is known, these systems are less effective where the a priori
information regarding the signal waveform is unavailable or
insufficient to allow the received signals to be compared
against a known signal waveform. Therefore, these systems
are generally limited to active systems that both transmit and
receive signals. Furthermore, these systems are less effective
when assumptions regarding the radiation pattern cannot be
made. Therefore, these systems are usually limited to those
applications where the signal source is sufficiently distant
from the sensor array that a signal pattern can be assumed.

A known technique for determining the direction of 
incoming signals without a priori information employs cor- 
relation strategies that compare signals received by the array 
at spatially distinct sensors to estimate the time delays 
between the sensors. The time delay information, along with 
assumptions about the radiation pattern, are used to estimate 
the location of the signal source. One example of correlation
strategies for locating talker position with a microphone
array in a near-field environment is set forth in Silverman et
al., A Two-Stage Algorithm for Determining Talker Location
from Linear Microphone Array Data, Computer Speech and 
Language, at 129-152 (1992). In general, the cross-corre- 
lation function of two signals received at two distinct 
sensors is computed and filtered in some optimal sense. The 
data processor includes a peak detector that detects the 
maximum value of the filtered signal. While the filtering 
criteria and the methods used for peak detection may vary 
considerably, these techniques are all based on maximizing 
the correlation between two received signals and determining
from the detected peak the relative time delays between
the associated sensors. Once the time delays are determined, 
techniques, such as triangulation, can be used to determine 
the location of the signal source. 
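
The correlate-then-detect-peak approach described above can be sketched as a brute-force search over candidate lags. This is a minimal illustration under stated assumptions: the function names are hypothetical, the signals are white noise with a known 5-sample offset, and the filtering step mentioned in the text is omitted entirely.

```python
import random


def cross_correlation(a, b, max_lag):
    """Return {lag: sum over n of a[n] * b[n + lag]} for lags in [-max_lag, max_lag]."""
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for n in range(len(a)):
            m = n + lag
            if 0 <= m < len(b):
                total += a[n] * b[m]
        result[lag] = total
    return result


def estimate_delay(a, b, max_lag):
    """Peak detector: the lag (in samples) by which b trails a."""
    corr = cross_correlation(a, b, max_lag)
    return max(corr, key=corr.get)


# Hypothetical sensor pair: sensor b receives the same noise 5 samples after sensor a.
random.seed(2)
sensor_a = [random.gauss(0.0, 1.0) for _ in range(300)]
sensor_b = [0.0] * 5 + sensor_a[:-5]
delay_estimate = estimate_delay(sensor_a, sensor_b, max_lag=10)
```

The nested loops perform one multiply-accumulate per sample per candidate lag, so the work grows with both the record length and the number of lags searched.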

Although these systems can work well, there is generally
a trade-off between the accuracy of the time delay estimate
and the computational expense incurred by the procedure.
Furthermore, there can be a trade-off between the accuracy of
the delay estimate and the rate at which the system can
acquire the incoming signals. The cross-correlation function
is a computationally intensive operation, and the accuracy of
the peak data increases with the number of comparisons
made during the correlation. In order to achieve a peak that
is sufficiently accurate and well defined to identify precisely
the position of the source, the computational burden can be
prohibitive. Therefore, these systems can fail to produce the
desired accuracy and update rate required for effective
beamforming in a real-time environment.

In view of the foregoing, an object of the present inven- 
tion is to provide improved signal processing methods and 
systems for combining a plurality of signals, and more 
particularly, to provide improved systems and methods for 
beamforming that dynamically determine the time delay 
estimates for a sensor array as part of the beamforming 
process. 

A further object of the present invention is to provide
systems and methods for real-time beamforming without the 
need of a priori information about the position of the signal 
source or knowledge of the signal radiation pattern. 

Another object of the present invention is to provide
signal processing systems and methods for adaptively aim- 
ing an array of sensor elements at a moving signal source. 

A yet further object of the present invention is to provide 
signal processing systems and methods that can dynamically 
compensate for a sensor array that has a non-uniform or
unknown spatial distribution of sensors. 

A still further object of the present invention is to provide 
systems and methods for real-time beamforming without the 
need of a priori information about the signal waveform. 

Still another object of the present invention is to provide
computationally efficient systems and methods to determine
the relative time delays between the signals received by the
sensor elements of a sensor array and employ these delay
estimates for computationally efficient beamforming and
source location.

These and other objects of the invention are evident in the 
sections that follow. 



SUMMARY OF THE INVENTION

The aforementioned objects are obtained by the present
invention which provides in one aspect an adaptive beamforming
apparatus which operates to combine a plurality of
frequency-dependent signals to enhance the reception of
signals from a signal source located at a select location
relative to the apparatus.

In one embodiment, the beamforming apparatus connects 
to an array of sensors, e.g. microphones, that can detect 
signals generated from a signal source, such as the voice of
a talker. The sensors can be spatially distributed in a linear, 
a two-dimensional array or a three-dimensional array, with 
a uniform or non-uniform spacing between sensors. In a 
typical practice, the sensor array can be mounted on a wall 
or a podium and the talker is free to move relative to the
sensor array. Each sensor detects the voice audio signals of 
the talker and generates electrical response signals that 
represent these audio signals. The adaptive beamforming 
apparatus provides a signal processor that can dynamically 
determine the relative time delay between each of the audio
signals detected by the sensors. Further, the signal processor
includes a phase alignment element that uses the time delays
to align the frequency components of the audio signals. The
signal processor has a summation element that adds together 
the aligned audio signals to increase the quality of the 
desired audio source while simultaneously attenuating 
sources having different delays relative to the sensor array. 
Because the relative time delays for a signal relate to the 
position of the signal source relative to the sensor array, the 
beamforming apparatus provides, in one aspect, a system 
that "aims" the sensor array at the talker to enhance the 
reception of signals generated at the location of the talker 
and to diminish the energy of signals generated at locations 
different from that of the desired talker's location.

A beamforming apparatus constructed according to the 
present invention can include a signal processor that deter- 
mines the relative time delay between a plurality of fre- 
quency-dependent signals. The signal processor can store 
one frequency-dependent signal as a reference signal and 
can align the remaining frequency-dependent signals relative
to this reference signal. The reference channel can
include a memory for storing one of the frequency dependent
signals as a reference signal having a user selected
phase angle. The reference channel can connect to a plurality
of alignment channels, where each alignment channel
couples to a respective one of the frequency-dependent
signals. The alignment channels can operate to adjust the
phase angle of each of the frequency-dependent signals in
order to align the signals relative to the reference signal.
Each alignment channel can have a phase difference estimator
that generates a delay signal which represents the time
delay between the reference signal and the respective signal
connected to the alignment channel. The alignment channel
can also include a phase alignment element that generates an
output signal as a function of the delay signal, which has a
magnitude that represents the magnitude of the respective
signal and a phase angle that is adjusted into a select phase
relationship with the reference signal. The signal processor
can further include a summation element that couples to the
alignment channels and to the reference channel. The summation
element can generate a beam signal by summing the
output signals with the reference signal.
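
The alignment and summation steps described above can be sketched in the frequency domain. This is a hedged illustration, not the patented apparatus: the function names are hypothetical, a plain discrete Fourier transform stands in for whatever produces the frequency-dependent signals, delays are assumed to be whole samples on a circular record, and the delay is supplied as an input instead of being produced by a phase difference estimator.

```python
import cmath
import math
import random


def dft(x):
    """Plain O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def align_to_reference(spectrum, delay_samples):
    """Rotate each bin's phase to undo delay_samples; magnitudes are unchanged."""
    n = len(spectrum)
    out = []
    for k, value in enumerate(spectrum):
        f = k if k <= n // 2 else k - n   # signed bin index
        out.append(value * cmath.exp(2j * math.pi * f * delay_samples / n))
    return out


def beam_signal(reference_spectrum, aligned_spectra):
    """Sum the aligned channel spectra with the reference spectrum, bin by bin."""
    return [sum(bins) for bins in zip(reference_spectrum, *aligned_spectra)]


# Hypothetical two-sensor case: the second channel is the reference delayed by 5 samples.
random.seed(3)
n = 32
reference = [random.gauss(0.0, 1.0) for _ in range(n)]
channel = [reference[(t - 5) % n] for t in range(n)]
ref_spec = dft(reference)
beam = beam_signal(ref_spec, [align_to_reference(dft(channel), 5)])
```

In this idealized case the aligned channel matches the reference exactly, so every bin of the beam signal is twice the corresponding reference bin; a source with a different delay would not add coherently.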

The adaptive beamforming apparatus can include an array 
of spatially distributed sensor elements for generating the 
plurality of frequency-dependent signals. The sensor ele- 
ments can be any one of a number of different types of 
elements capable of detecting a signal. Examples of such 
sensor elements include antennas, microphones, sonar trans- 
ducers and various other transducers capable of detecting a 
propagating signal and transmitting the signal to the signal 
processor. 

The sensor elements are spatially distributed to form an 
array for detecting a signal. Each sensor in the array can 
generate a single signal that represents the signal detected at 
that sensor element as a function of time. The spatial
distribution of sensor elements can be unknown or non- 
uniform. The invention can be practiced with a linear array, 
a two dimensional array, or a three dimensional array. 

In one embodiment of the invention, the reference channel
of the signal processor can connect to the phase difference
estimator of each alignment channel. In this practice,
the phase difference estimator includes a memory for storing
the reference signal and for storing the respective frequency-dependent
signal associated with the respective alignment
channel. The phase difference estimator has a processing
means to generate the delay signal as a function of the 
reference signal and the respective frequency-dependent 
signal. 

In an alternative embodiment, the signal processor can 
include interconnected alignment channels that determine
the relative time delay between spatially adjacent sensors. In
this practice, the phase difference estimator can include a
memory for storing the respective frequency-dependent signal
of the associated alignment channel and the respective
frequency-dependent signal of the second alignment channel.
The memory can further store the delay signal of the
second alignment channel. The phase difference estimator 
can include a summing element that generates a delay signal 
as a function of the signal associated with the respective 
alignment channel and delay signal of the second alignment 
channel. 

In an alternative embodiment of the invention, the signal
processor can include a weighting element that can increase
or decrease the magnitude component of selected output
signals. The weighting element can be a weighted averaging
element that can affect the magnitudes of the output as a
function of the number of output signals summed together.

In a further alternative embodiment of the present invention,
an error detector is associated with each of the delay
estimators and determines, from the delay signals and the
frequency-dependent signals, an error signal that represents
the accuracy of the delay signals. The error signal can be
used by the weighted averaging element to determine which
of the output signals has an associated error signal that is
larger than a user-selected error parameter. The summation
means can effect the weighting of that output signal
responsive to the error signal, including deleting that output
signal from the signal summation.
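
As a minimal sketch of the error-gated averaging described above, assuming each output signal is held as a list of complex frequency bins and carries one scalar error value. The names `gated_average` and `max_error` are hypothetical and do not come from the specification; the averaging divides by the number of signals actually summed.

```python
def gated_average(spectra, errors, max_error):
    """Average the spectra whose error does not exceed max_error.

    spectra   -- list of equal-length lists of complex bins (assumed layout)
    errors    -- one non-negative error value per spectrum (assumed scalar)
    max_error -- stand-in for the user-selected error parameter
    """
    # Delete from the summation every output whose error is too large.
    kept = [s for s, e in zip(spectra, errors) if e <= max_error]
    if not kept:
        raise ValueError("every output signal exceeded the error parameter")
    n_bins = len(kept[0])
    # Scale by the number of signals actually summed together.
    return [sum(s[k] for s in kept) / len(kept) for k in range(n_bins)]
```

With three outputs and errors of 0.1, 0.2 and 5.0, a `max_error` of 1.0 would leave the third output out of the average.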

In another further embodiment of the invention, the delay
estimator generates a delay signal that represents the time
delay between a reference signal and a respective one of the
frequency-dependent signals, by measuring the difference
between the phase angle components of the
frequency-dependent signals. In one embodiment the delay estimator
measures the difference in phase angles between the
reference signal and the respective frequency-dependent signal of
that alignment channel. The delay estimator can calculate,
from the differences in phase angles and from the frequency
associated with each phase angle, the relative phase shift
between the two signals. In one embodiment of the invention,
the delay estimator can further include a weighting
system that multiplies the difference in phase angles of each
frequency component of two respective signals by the
magnitude of that frequency component.

BRIEF DESCRIPTION OF THE DRAWINGS 

The foregoing and other aspects of the invention may be
more fully understood from the following description, when
read together with the accompanying drawings in which like
reference numbers indicate like parts in the several figures,
and in which:

FIG. 1 illustrates a schematic block diagram of one 
embodiment of a beamforming apparatus constructed 
according to the present invention; 

FIG. 2 illustrates a schematic block diagram of one
alignment channel of the beamforming apparatus depicted in
FIG. 1;

FIG. 3 illustrates an alternative embodiment of a
beamforming apparatus constructed according to the present
invention that includes phase difference estimators
connected between spatially adjacent sensor elements;

FIG. 4 illustrates the operation of a delay estimator that 
includes an unwrapping element for limiting spatial aliasing; 

FIG. 5 illustrates a further embodiment of the present
invention that includes an orthogonal array of sensor
elements;

FIG. 6 illustrates in more detail the orthogonal array of
FIG. 1.

DETAILED DESCRIPTION 

FIG. 1 depicts an adaptive beamforming apparatus 10
constructed in accord with the invention. The illustrated
apparatus 10 includes a sensor array 12 and a signal
processor 14. The sensor array 12 includes the sensors 16,
sampling units 18, window filters 20 and time-to-frequency
transform elements 22. The signal processor 14 includes a
reference channel 24 and plural alignment channels 26. Each
alignment channel 26 includes a phase difference estimator
28, a phase alignment element 30 and an optional weighting
element 32. The illustrated system 10 further includes a
summation element 34 and a frequency-to-time transform
element 36.

The illustrated sensor array 12 includes a plurality of
sensor elements 16. The sensors 16, in the depicted
embodiment, are arranged to form a spatially distributed linear array
of sensors 16 each spaced apart by a distance X and arranged
to receive input signals having signal components from a
signal source, such as the target source 38. In the illustrated
embodiment, each sensor 16 is the front end of a reception
channel that includes a sampling unit 18, a window filter 20
and a time-to-frequency transform element 22, all connected
in electrical circuit. Each of the illustrated reception
channels is a distinct subsystem of the sensor array 12 and can
operate simultaneously with and independently from the
other reception channels.

Each sensor 16 detects signals, including signals
generated from the target source 38, and generates an electrical
response signal that includes a component that represents the
signal generated from the signal source 38. The sensors 16
in the sensor array 12 can be microphones, antennas, sonar
phones or any other sensor capable of detecting a signal
propagating from the source 38 and generating an electrical
response signal that represents the detected signal.

Each illustrated sampling element 18 is in electrical 
circuit with one sensor 16 and generates a digital response 
signal by sampling the electrical response signal generated 
by the associated sensor 16. The sampling element 18 can be 
a conventional analog-to-digital converter circuit of the type 
commonly used to sample analog electrical signals and
generate digital electrical signals that represent the sampled
signal. The sampling element 18 generates samples of the
electrical response signal at a rate, $f_s$, selected according
to the application of the beamforming apparatus 10. The 
sampling rate is generally determined according to the 
highest frequency component of the propagating signal of 
interest and according to the Nyquist rate. The sampling 
elements 18 are discussed in further detail below. 

The window filter 20 can be a conventional digital win- 
dow filter for selecting a discrete portion of a digital 
response signal. In the illustrated embodiment the window 
filter 20 is in electrical circuit with the output of the 
sampling element 18, and generates a finite length digital 
signal by truncating the digital signal generated by the 
sampling unit 18. In one embodiment, the window filter 20 
can be a rectangular window filter that truncates the digital
signal to a user-selected number of samples to represent the 
input signal detected by sensor 16. Each discrete portion of 
the sampled signal is a frame of data that corresponds to the 
signal detected by the sensor 16 during a time period 
determined by the sampling rate and the number of samples 
present in the frame. The window filter 20 is discussed in
further detail below.

In the depicted apparatus 10, the window filters 20 are in
electrical circuit with the time-to-frequency transform
elements 22. Each time-to-frequency transform element 22 can
receive the data frames generated by filter 20 and transform
each data frame into a frequency-dependent signal that
represents the spectral content of the signal detected by the
associated sensor 16 during the time period of the
corresponding data frame. Each frequency-dependent signal can
include a magnitude component, $|R|$, and a phase angle
component, $\phi$, for each frequency, $\omega_n$, in the spectral content
of the transformed data frame. In one embodiment of the
present invention, the frequency-dependent signals are
stored in the apparatus 10 as complex arrays. Each complex
array can include a storage cell that corresponds to a
predetermined frequency, and therefore can store the
spectral contents of a data frame by filling the appropriate
cell with the magnitude and phase angle of the
corresponding frequency component in the spectral content of the data
frame. For example, the stored signal
can be a complex array that represents the spectral content
of one frame of data, and has a first array, $|R|$, that represents
the magnitude component of each frequency, $\omega_n$, and has a
second array, $\phi$, that represents the phase angle component
of each frequency, $\omega_n$. Other methods of storing or
representing frequency-dependent signals should be apparent to
one of ordinary skill in the art of signal processing and do
not depart from the scope of the invention.
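
The magnitude and phase storage described above can be sketched as follows. This is only an illustration under assumptions: the direct O(N^2) DFT stands in for whatever efficient transform an implementation would use, and the function names `dft` and `to_magnitude_phase` are hypothetical.

```python
import cmath
import math


def dft(frame):
    """Direct discrete Fourier transform of one data frame (illustrative, O(N^2))."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        for k in range(n)
    ]


def to_magnitude_phase(spectrum):
    """Split a complex spectrum into a magnitude array and a phase angle array.

    Cell k of each array corresponds to the frequency 2*pi*k/N.
    """
    magnitudes = [abs(value) for value in spectrum]
    phases = [cmath.phase(value) for value in spectrum]
    return magnitudes, phases
```

Keeping magnitude and phase in separate arrays makes the later steps, which operate only on phase angles, straightforward to express.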

Therefore, the sensor array 12 generates from the target
source 38 a plurality of frequency-dependent signals,
wherein each frequency-dependent signal is associated with
one sensor 16, and represents the signal generated by target
source 38, as "heard" by the associated sensor 16. The
time-to-frequency transform element 22 can be any of the
commonly known signal processing techniques for
efficiently computing the discrete Fourier transform of a
time-domain signal. In a preferred embodiment of the invention
the time-to-frequency transform element 22 is a Fast Fourier
Transform element that performs the discrete Fourier
transform on the windowed input signal generated by filter 20. It
should be apparent to anyone of ordinary skill in the art of
signal processing that any efficient algorithm for
transforming the input signal from the time domain to the frequency
domain can be practiced with the illustrated system, without
departing from the scope of the present invention.

The signal processor 14, constructed according to the 
invention, combines the input signals detected by the sensor 
array 12 and essentially "aims" the sensor array 12 at a 
signal source, e.g. source 38. The processor 14 "aims" the 
array 12 by generating a beam signal 66 that represents a 
combination of phase aligned input signals. The beam signal 
66 enhances, i.e. increases the signal-to-noise ratio, of 
signals generated from a source at the position of target 
source 38 relative to the sensor array 12. 

The signal processor 14 has a reference channel 24, plural
alignment channels 26 and a summation element 34. The
reference channel 24 connects to one input channel and
stores the frequency-dependent signal associated with that
input channel in the memory element 40 as a reference
signal 25. The phase angle components of the reference
signal can be defined as in-phase relative to the phase angle
components of the other frequency-dependent signals. Each
alignment channel 26 generates an output signal 64
representing the signal received at the associated sensor 16 phase
aligned relative to the reference signal 25. The phase aligned
signals are combined to form the beam signal 66.

The illustrated signal processor 14 is in electrical circuit
with the sensor array 12 and receives the frequency-dependent
signals generated by the time-to-frequency elements
22. The signal processor 14, depicted in FIG. 1, is
represented as circuit assemblies connected in electrical circuit. It
should be apparent to one of ordinary skill in the art of signal
processing that each circuit assembly depicted in FIG. 1 can
be implemented as a software module and that the software
modules can be similarly interconnected in a computer
program to implement the signal processor 14 as an
application program running on a conventional digital computer.

The illustrated signal processor 14 includes a plurality of
channels each connected to a respective one of the
frequency-dependent signals. In the illustrated embodiment,
the signal processor 14 includes a reference channel 24 and
a plurality of alignment channels 26. The reference channel
24 has a storage element 40 for storing the reference signal
25 that represents the input signal detected by one of the
sensors 16. The memory 40 can store the reference signal 25
as a complex array. The storage element 40 is in electrical
circuit via the conducting element 42 to each of the
alignment channels 26. The conducting element 42 connects to
each of the phase difference estimators 28 in the alignment
channels 26. The phase difference estimator 28 of each of the
alignment channels 26 has a second input 46 that is in
electrical circuit with the output of a time-to-frequency
transform element 22.

With reference to FIG. 1 it can be seen that the alignment
channels 26 of the illustrated signal processor 14 each
connect to one time-to-frequency transform element 22. The
phase difference estimator 28 of each alignment channel 26
generates a delay signal 60 which approximates the time
delay between the signal 25 detected by the sensor 16
associated with the reference channel 24 and the signal
detected by the sensor 16 associated with the alignment channel
of the phase difference estimator 28. This estimated delay
signal 60 can be generated by any of the conventional time
delay estimation techniques. These techniques can include
cross-correlation algorithms with peak picking or
frequency-based delay estimators, including one preferred
frequency-based delay estimator that will be described in greater
detail hereinafter. For those delay estimators that include correlation
techniques that operate in the time domain, the phase
difference estimator can include a frequency-to-time transform
element to convert the magnitude and phase angle data of a
data frame into a time-dependent signal. A frequency-to-time
transform element suitable for practice with the present
invention will be explained in greater detail hereinafter.
However, any conventional domain transform algorithm or
system can be practiced with the present invention without
departing from the scope of the invention, and such domain
transform elements are considered within the ken of one of
ordinary skill in the art of signal processing.
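
The conventional time-domain alternative mentioned above, cross-correlation with peak picking, might be sketched as below. This is a generic textbook form rather than a method taken from the specification; the function name and the `max_lag` search bound are hypothetical.

```python
def cross_correlation_delay(reference, delayed, max_lag):
    """Return the integer lag (in samples) that maximizes the cross-correlation.

    A positive result means `delayed` lags behind `reference`.
    Both inputs are equal-length sequences of real samples.
    """
    n = len(reference)
    best_lag = 0
    best_score = float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Correlate only over the samples where both sequences overlap.
        score = sum(
            reference[i] * delayed[i + lag]
            for i in range(n)
            if 0 <= i + lag < n
        )
        if score > best_score:  # peak picking
            best_lag, best_score = lag, score
    return best_lag
```

The resolution of this estimate is limited to whole samples unless the correlation peak is interpolated, and it requires a search over candidate lags.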

As further depicted by FIG. 1, each alignment channel 26
of the signal processor 14 includes a phase alignment
element 30 that connects in electrical circuit via the
conducting element 48 to the output of the phase difference
estimator 28. The conducting element 48 carries the delay
signal 60 to the first input 50 of phase alignment element 30.
A second input 52 of phase alignment element 30 connects
to the respective frequency-dependent signal of the
respective input channel. As will be explained in greater detail
hereinafter, the phase alignment element 30 can generate an
output signal that is phase-aligned to the reference signal 25
stored in storage element 40.

The output signals 64 of the depicted signal processor 14 
are applied to optional weighting elements 32. The weight- 
ing element 32 can increase or decrease the magnitude of the 
output signal. Each of the weighting elements 32 generates a
weighted output signal that connects to the summation
element 34. The summation element 34 can sum together the
weighted and phase-aligned signals of each alignment
channel and the weighted reference signal 25 of the refer- 
ence channel 24. The summation element 34 generates a 
beam signal 66. The beam signal 66 represents a combina- 
tion of phase aligned input signals that enhances, i.e. 
increases the gain, of signals generated from a source at the 
position of target source 38 relative to the sensor array 12. 

With reference to FIG. 2, the construction and operation
of a signal processor 14 constructed according to the
embodiment shown in FIG. 1 can be described. FIG. 2
illustrates the reference channel 24, the memory element 40,
and a phase alignment channel 26 that includes a phase
difference estimator 28 and a phase alignment element 30. The
phase alignment element 30 and the memory element 40 are
in electrical circuit to the summing element 34 that generates
a signal transmitted over a conducting wire to the
frequency-to-time transform element 36. In the illustrated embodiment,
the alignment channel 26, including the phase difference
estimator 28 and the phase alignment element 30, aligns the
frequency-dependent signal 68, transmitted via conducting
element 42, to the reference signal 25, stored in the data
memory 40.

In a first step, the phase difference estimator 28 generates
the delay signal 60 that represents the time delay between
the reference signal 25 and the frequency-dependent signal
68. In a second step, the phase alignment element 30
calculates, for each frequency component of the
frequency-dependent signal 68, the phase shift:

$$\frac{2\pi k\,\tau_{ij}}{N}$$

for that frequency component caused by the time delay. The
phase alignment element 30 can align each frequency
component of signal 68 as a function of the delay signal 60, $\tau_{ij}$,
and the frequency, $2\pi k/N$, where N can be the FFT size
and k can represent the frequency component, via the
addition of the corresponding shift, as given in the formula
above, to the phase angle of the frequency-dependent signal
68. The phase alignment element 30 generates the output
signal 64, that is aligned to the reference signal 25, and that
can be represented as a complex array, including a
magnitude component and a phase angle component. In a final
step, the aligned signal 68 and the reference signal 25 are
combined by the summing element 34 to generate the beam
signal 66.
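
The three steps above can be sketched as follows, assuming the delay is already known and is expressed in samples, and that the delayed channel lags the reference (so the shift $2\pi k\tau/N$ is added to its phase). The helper names are hypothetical, and the direct DFT is used only to keep the sketch self-contained.

```python
import cmath
import math


def dft(frame):
    """Direct DFT, adequate for a short illustrative frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        for k in range(n)
    ]


def phase_align(spectrum, delay_samples, n_fft):
    """Add the phase shift 2*pi*k*delay/N to the phase angle of every bin k."""
    aligned = []
    for k, value in enumerate(spectrum):
        magnitude, phase = cmath.polar(value)
        shift = 2.0 * math.pi * k * delay_samples / n_fft
        aligned.append(cmath.rect(magnitude, phase + shift))
    return aligned


def beam(reference_spectrum, aligned_spectra):
    """Sum the reference spectrum with every phase-aligned spectrum, bin by bin."""
    total = list(reference_spectrum)
    for spectrum in aligned_spectra:
        total = [t + s for t, s in zip(total, spectrum)]
    return total
```

For a channel that is a circularly delayed copy of the reference, aligning with the true delay reproduces the reference spectrum, so the beam for one such channel is twice the reference.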

The phase difference estimator 28 illustrated in FIG. 2
includes the data memory 54, a phase angle subtractor 56
and a delay estimator 58. The illustrated phase difference
estimator 28 is a frequency-domain phase difference
estimator that generates the delay signal 60 that represents the
relative time delay between the reference signal 25 stored in
data memory 40 and the signal 68 stored in the data memory
54. The illustrated data memory 54 provides storage for a
complex array having a magnitude component $R_j$ and phase
angle component $\Phi_j$. The data memory 54 is in electrical
circuit with the phase angle subtractor 56 that includes a data
memory for storing the phase angle component, $\Phi_i$, of the
reference signal 25 and for storing the phase angle
component, $\Phi_j$, of the signal 68 stored in data memory 54. The
phase angle subtractor 56 generates a signal 62 that
represents the differences between the phase angles of the
reference signal 25 and the phase angles of the respective
frequency-dependent signal associated with that alignment
channel 26. The signal 62 can represent the phase angle
difference as an array that has cells indexed by frequency.
The difference signal 62 can be transmitted over a
conducting element to the delay estimator 58. In the illustrated
embodiment the delay estimator 58, which will be explained
in greater detail hereinafter, generates the delay signal 60 as
a function of the phase angle difference signal 62.

The delay signal 60 connects via a conducting element to
the phase alignment element 30. As illustrated by FIG. 2, the
phase alignment element 30 is in electrical circuit with
conducting element 42 to receive the frequency-dependent
signal 68 associated with the alignment channel 26. The
phase alignment element 30 can include a phase shift
element 69 that can generate a shift signal representative of
the phase shifts for each of the frequency components of the
signal 68. The phase alignment element 30 can increment
the phase angle $\Phi_j$ of the associated frequency-dependent
signal by the shift signal. In one embodiment of the present
invention, the phase alignment element 30 can be a
programmable arithmetic-logic-unit that multiplies the phase
angle of the associated frequency-dependent signal with the
corresponding phase shift signal. However, it should be
obvious to one of ordinary skill in the art of signal
processing that the phase alignment element 30 can be implemented
as a software module that includes programming structure
for multiplying the phase angles of the signal 68 by the
corresponding phase shift signals.

As further illustrated by FIG. 2, the output signal 64 is
transmitted via a conducting element to the summation
element 34 along with the reference signal 25 stored in data
memory 40. The summation element 34 generates a beam
signal 66 that represents the summation of the aligned output
signals 64 from each of the alignment channels 26 in the
signal processor 14 and the reference signal 25 stored in data
memory 40. The illustrated signal processor of FIG. 2
includes an optional frequency-to-time transform element 36
that generates a time-dependent signal that represents
the beam signal 66. In the illustrated embodiment the
frequency-to-time domain transform element 36 is an inverse
FFT of the type conventionally used to transform discrete
signals from the frequency domain to the time domain.

With reference to FIG. 3, one preferred embodiment of
the present invention can be described. FIG. 3 depicts a
beamforming apparatus 70 connected to a sensor array 12
and a signal processor 78. The signal processor 78 includes
a reference channel 24 that provides a data storage element
40 for storing one frequency-dependent signal associated
with one of the sensors 16 as a reference signal 25 that
includes a magnitude component and a phase angle
component. The phase angle component of the reference signal
25 stored in the data memory 40 includes a phase angle
corresponding to each one of the frequency components of
the input signal detected by the sensor 16 associated with the
reference channel 24. The phase angles of the reference
signal 25 can represent a reference phase for that frequency
component of the signal generated by the source 38. The
storage element 40 generates an output signal that connects
via a conducting element to the phase difference estimator
28 of the first alignment channel 26. As can be seen with
reference to FIG. 3, the alignment channel 26 includes a
phase difference estimator 28 and phase alignment element
30 constructed similarly to the previously described
embodiment. The system 70 further includes a plurality of
alignment channels 76 that include a phase difference estimator
72, a summing element 74, and a phase alignment element
30. The alignment channels 76 connect between two input
channels of the sensor array 12. In the illustrated
embodiment the alignment channels 76 preferably connect to
spatially adjacent sensors in the sensor array 12.

In the illustrated embodiment of FIG. 3, the phase
difference estimator 72 of each alignment channel 76 connects via
conducting elements to the input channels of two spatially
adjacent sensor elements to generate a delay signal 60 that
represents the time delay between these two spatially
adjacent sensors 16. The alignment channel 76 further includes
a summing element 74. The summing element 74 has a first
input 80 that connects via a conducting element to the output
of the phase difference estimator 72. The summing element
74 has a second input 82 that connects via a conducting
element to the delay signal of a phase difference estimator
associated with a sensor 16 that is spatially adjacent. The
summing element 74 generates an output signal that is
connected via a conducting element to the phase alignment
element 30.

As can be described with reference to FIG. 3, the
alignment channel 26 calculates the time delay between the
reference signal 25 and the frequency-dependent signal
generated by the spatially adjacent sensor 88. A second
alignment channel 76 calculates the time delay between the
sensor 88 and the sensor 89. The summing element 74 of the
alignment channel 76 connects between the channel 26 and
the channel 76 and can add together the two time delays to
generate a cumulative delay signal 86. The cumulative delay
signal 86 represents the time delay between the sensor 16 of
the reference channel 24 and the sensor 89 of the associated
alignment channel 76. As illustrated, each summing element
74 of each alignment channel 76 adds the cumulative delay
signal 86 to the delay signal 60 generated by the phase
difference estimator 72. Therefore, the cumulative delay
signal 86 references each alignment channel 76 to the
reference channel 24.
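
The chaining of adjacent-pair delays into delays relative to the reference channel amounts to a running sum, sketched below. The function name and the list layout (one delay per adjacent pair, ordered outward from the reference sensor) are assumptions made for illustration.

```python
from itertools import accumulate


def cumulative_delays(adjacent_delays):
    """Convert adjacent-sensor delays into delays relative to the reference sensor.

    adjacent_delays[0] is the delay between the reference sensor and its
    neighbour, adjacent_delays[1] the delay between that neighbour and the
    next sensor, and so on. Entry n of the result is the total delay between
    the reference sensor and sensor n + 1.
    """
    return list(accumulate(adjacent_delays))
```

One consequence of this arrangement is that an error in any pairwise estimate carries forward into every cumulative delay computed after it.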

The cumulative signal 86 generated by the summing
element 74 represents the summed time delay between the
reference signal 25 stored in data memory 40 and the
frequency-dependent signal associated with the alignment
channel 76. The phase alignment element 30 phase shifts the
associated frequency-dependent signal by the total time delay
represented by the signal 86 of summing element 74. The
phase shift added to each frequency component of the
associated frequency-dependent signal aligns the associated
frequency-dependent signal to the reference signal 25 stored
in data memory 40. The phase alignment element 30
generates an output signal 64 representative of the associated
frequency-dependent signal phase aligned with the reference
signal 25 stored in data memory 40. The output signal of
phase alignment element 30 is transmitted via a conducting
element to the summing element 34. As previously
described, the summing element 34 sums the output signals
generated by the alignment channels 26 and 76 with the
reference signal stored in data memory 40. The combined
signals represent a beam signal 66 that can be transmitted
by a conducting element to the optional frequency-to-time
transform means 36. The optional frequency-to-time
transform element 36 can provide an output signal that represents
the beam signal 66 as a time-dependent signal.

The invention will now be further described with
reference to one preferred embodiment that includes a
frequency-domain delay estimator 58 and a linear array of microphones
16. The frequency-domain delay estimator 58 aims the
sensor array 12 by dynamically determining the time delay
between two frequency-dependent signals to maximize the
power in the beam signal 66 formed by the summation of the
frequency-dependent signals. A signal processor 14 with this
preferred frequency-domain delay estimator 58 is shown to
be accurate over a wide range of signal-to-noise conditions
and an effective basis for more complex acoustic-array
applications, such as source detection and tracking
procedures. Further, it is suitable for determining the time delay
between wide-band frequency-dependent signals, where
there is limited a priori knowledge of the spectral content of
the signals.

The sensor array 12 includes a linear array of eight
microphone sensors 16 distributed at 16.5 cm intervals along
one wall of a room. The input signals detected by the
microphones 16 are digitized simultaneously at 20 kHz by
sampling units 18 of eight distinct input channels. The 20
kHz sampled input signals are windowed by window filter
elements 20 into finite sequences. For each sequence the
DFT is computed by the associated time-to-frequency
transform element 22 and converted to a magnitude-phase
representation. The choice of the window filter 20 and the size
as well as the DFT length vary with the particular
application and computational availability. One preferred window
filter 20 is a 512-point Hanning window applied with zero
padding for use with a 1024-point FFT as a time-to-frequency
transform element 22. The individual segments can
be half-overlapping in time to facilitate reconstruction.
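
A minimal sketch of this framing step follows, assuming the periodic form of the Hanning window (so that half-overlapped windows sum to one) and a hop of half the window length. The function names are hypothetical, and the choice of the periodic rather than the symmetric window is an assumption not stated in the text.

```python
import math


def hann(length):
    """Periodic Hanning window; half-overlapped copies sum exactly to one."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / length) for n in range(length)]


def make_frames(signal, window_length=512, fft_length=1024):
    """Cut a sampled signal into half-overlapping, windowed, zero-padded frames."""
    hop = window_length // 2
    window = hann(window_length)
    padding = [0.0] * (fft_length - window_length)
    frames = []
    for start in range(0, len(signal) - window_length + 1, hop):
        segment = signal[start:start + window_length]
        windowed = [s * w for s, w in zip(segment, window)]
        frames.append(windowed + padding)  # zero padding up to the FFT size
    return frames
```

Because adjacent windows sum to one in the overlap region, overlap-adding the processed frames can reconstruct the time-domain signal without amplitude ripple.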

For each pair of spatially consecutive microphones 16, the
phase angle subtractor 56 calculates the phase angle
differences between corresponding frequencies and generates the
signal 62, $d_{ij}(k)$. Each frequency component of the
frequency-dependent signals can be represented by:

[equation not legible in source]

where N is the DFT length, k = 0, 1, ..., N-1, and $\omega$ is
angular frequency. R(k) represents the spectral magnitude
component of the frequency-dependent signal. A phase delay
signal 60, $\tau_{ij}$, is then computed according to the function:

[equation not legible in source]

where R(k) can represent, in one embodiment, the geometric
mean of the magnitude components of the
frequency-dependent signals, $R_i$, $R_j$. It should be obvious to one of ordinary
skill in the art of signal processing that values for R(k) can
be computed using other statistical techniques, including
determining the median of plural signals, weighted
averaging and other techniques that can improve signal-to-noise
rejection and error estimate.
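
One form consistent with the surrounding description is a magnitude-weighted least-squares slope of the phase differences against frequency, constrained to pass through the origin. The sketch below is an assumption, not the specification's own expression: the weighting by R(k) to the first power, the function names, and the use of $\omega_k = 2\pi k/N$ are all illustrative choices.

```python
import math


def geometric_mean_weights(magnitudes_i, magnitudes_j):
    """R(k) as the geometric mean of the two magnitude components."""
    return [math.sqrt(a * b) for a, b in zip(magnitudes_i, magnitudes_j)]


def estimate_delay(phase_diff, weights, n_fft, k_upper):
    """Weighted least-squares slope (in samples) of phase_diff versus frequency.

    Minimizes sum_k w(k) * (d(k) - omega_k * tau)^2 over k = 0..k_upper,
    with omega_k = 2*pi*k/N. Requires k_upper >= 1 and positive weights.
    """
    numerator = 0.0
    denominator = 0.0
    for k in range(k_upper + 1):
        omega = 2.0 * math.pi * k / n_fft
        numerator += weights[k] * omega * phase_diff[k]
        denominator += weights[k] * omega * omega
    return numerator / denominator
```

Weighting by magnitude gives the strongest spectral components the most influence on the fitted slope, which is one way to reduce the effect of bins dominated by noise.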

The frequency-domain delay estimator 28 can include an
optional unwrapping element 96. The unwrapping element
96 is understood to resolve any spatial aliasing in the delay
signal 60. In one embodiment the delay estimator 28
includes an unwrapping element 96 that can generate the
delay signal 60 in three iterations, each of which
generates an increasingly accurate estimate of the time delay
between the signals. The accuracy of the delay estimate is
understood to depend upon the limits of summation in the
above equation. In general, the delay estimate tends to
converge upon the true delay more precisely as the number
of terms in the summation is increased. Therefore, it is
preferred to sum over k = 0, ..., $K_{\max}$, where $K_{\max}$
corresponds to the highest frequency of interest. For speech,
a reasonable cutoff can be 5.4 kHz.

However, the $2\pi m$ phase ambiguity in the delay signal 60
can restrict the region in which the phase angle difference
signal 62 is understood to vary in a linear fashion and
therefore limits the upper bound of the summation
index. One preferred unwrapping element 96 generates a
delay signal 60 by providing two initial estimates of the
delay signal 60.

The unwrapping element 96 can generate an initial
estimate for the delay signal 60, $\tau_{ij}$, by determining a first
frequency range over which spatial aliasing is understood
not to occur. The first range, K, is determined by:

[equation not legible in source]

where c is the propagation speed of the input signals, and
$|m_i - m_j|$ represents the spatial distance between the
microphones 16. The minimum of the two solutions can be used
for K.

The unwrapping element 96 can generate a second
estimate of the delay signal 60 by computing the delay signal 60
over the range determined by:

[equation not legible in source]

The error term, e, can be included in the above expression
to compensate for the inaccuracy of the initial estimate of the
delay signal 60, $\tau_{ij}$. Nominal values for e range from 0.5 to
2 samples, depending on the expected accuracy of the initial
estimate.
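
The two summation ranges can be sketched by requiring that the phase difference stay below $\pi$ in magnitude over the summed bins. The formulas below are assumptions derived from that requirement, not expressions taken from the specification, and the function names, the speed of sound of 343 m/s, and the cap at the highest bin of interest are likewise illustrative.

```python
import math


def first_range(n_fft, sample_rate_hz, spacing_m, speed_m_s, k_highest):
    """Largest bin index free of spatial aliasing for the worst-case delay.

    The largest possible delay between two sensors is spacing/speed seconds,
    so 2*pi*k*delay_in_samples/N < pi gives k < N*c / (2*fs*spacing).
    """
    k = math.floor(n_fft * speed_m_s / (2.0 * sample_rate_hz * spacing_m))
    return min(k, k_highest)


def second_range(n_fft, initial_delay_samples, error_samples, k_highest):
    """Largest bin index that stays unwrapped if the true delay lies within
    error_samples of the initial estimate (both in samples)."""
    bound = abs(initial_delay_samples) + error_samples
    k = math.floor(n_fft / (2.0 * bound))
    return min(k, k_highest)
```

With a 1024-point transform, 20 kHz sampling, 16.5 cm spacing and the assumed 343 m/s, `first_range` gives 53 bins; with an initial estimate of 1.513 samples and an error term of 1.5 samples, `second_range` gives 169 bins.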

In a third iteration, the unwrapping element 96 uses the
second estimate of the delay signal 60, $\tau_{ij}$, to unwrap the
phase angle difference signal 62, $d_{ij}(k)$, and then a final
estimate for the delay signal 60 can be computed over the
entire frequency range of interest ($K = K_{\max}$). The phase
angle differences in signal 62 should vary linearly in
frequency, with variations in linearity due to additive noise in
the sensor signal. The delay estimator 58 can examine the
phase angle differences as a function of frequency, and given
the second estimate of the delay signal 60, unwrap the phase
differences that evidence a $2\pi m$ phase ambiguity. It is
preferred that the unwrapping depend upon an accurate
estimate of $\tau_{ij}$, which is typically not available until the end
of the second iteration.

The iterative procedure of the unwrapping element 96 is
illustrated in FIG. 4. The upper graph is a plot of spectral
magnitudes in dB for the frequency-dependent signal, the
middle graph displays the original phase angle difference
signal 62 used for the first two iterations, and the bottom
graph is the unwrapped phase angle difference signal 62
applied in the final iteration of the algorithm. In each case,
the horizontal axis is the first 275 points of the DFT,
corresponding to 0 through 5.4 kHz. In the initial stage,
K = 53, which, when used as the upper bound of the
summations for the initial estimate of the delay signal 60,
generates a time delay of τ = 1.513 samples.
This estimate of the delay signal 60 is then used to calculate
the range of summation for the second iteration. Using an
error term e = 1.5 samples, K = 69 and the second delay
estimate for signal 60 is found to be τ = 2.579 samples. The
delay signal 60 may be viewed as the slope of the line that
fits these points in a weighted mean squared sense. In the
second graph, the phase wrapping ambiguity is apparent and
the graph does not appear to be linear. In the third iteration,
the phase differences in the signal 62 are unwrapped by the
unwrapping element 96 and plotted on the lower graph. The
unwrapping algorithm places each phase angle difference
within π radians of the slope line by adding or subtracting
integer multiples of 2π. The dotted lines in the lower graph
represent the boundaries of the unwrapping algorithm. The
final delay signal 60, τ, is then calculated with the
unwrapped phase angle difference signal 62, d(k), over the
entire frequency range (k = 0, 1, . . . , K_max).
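
The unwrapping and slope-fitting steps described above can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that the phase difference at DFT bin k is modeled as a line through the origin, that the delay in samples equals the fitted slope times N/(2π) for an N-point DFT, and that N = 512. The function names are hypothetical.

```python
import math


def unwrap_to_slope(phase_diffs, slope):
    """Shift each phase difference by an integer multiple of 2*pi so that
    it lies within pi radians of the line slope * k (k = DFT bin index)."""
    unwrapped = []
    for k, d in enumerate(phase_diffs):
        target = slope * k
        m = round((target - d) / (2.0 * math.pi))  # number of 2*pi wraps to undo
        unwrapped.append(d + 2.0 * math.pi * m)
    return unwrapped


def weighted_slope(phase_diffs, weights, k_max):
    """Slope s of the line through the origin that minimizes
    sum_k w[k] * (d[k] - s * k)**2 over k = 0 .. k_max."""
    num = sum(weights[k] * k * phase_diffs[k] for k in range(k_max + 1))
    den = sum(weights[k] * k * k for k in range(k_max + 1))
    return num / den


def slope_to_delay_samples(slope, n_dft):
    """Convert a slope in radians per bin to a delay in samples
    (assumed model: phase(k) = 2*pi*k*delay/n_dft)."""
    return slope * n_dft / (2.0 * math.pi)


if __name__ == "__main__":
    n_dft = 512                       # assumed DFT length
    true_slope = 2.0 * math.pi * 2.5 / n_dft
    # Wrapped phase differences in [-pi, pi) for the first 275 bins.
    wrapped = [((true_slope * k + math.pi) % (2.0 * math.pi)) - math.pi
               for k in range(275)]
    rough = 2.0 * math.pi * 2.4 / n_dft   # earlier, less accurate estimate
    fixed = unwrap_to_slope(wrapped, rough)
    slope = weighted_slope(fixed, [1.0] * 275, 274)
    print(slope_to_delay_samples(slope, n_dft))
```

The sketch uses uniform weights for simplicity. A magnitude-based weighting, as the text suggests, could be passed in through the `weights` argument instead.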

The frequency-domain delay estimator has several advantages
over its time-domain counterpart. It is computationally
simple, does not necessitate the use of search methods, and 
has precision independent of sampling rate. 

With reference again to FIG. 1, a further embodiment of
the present invention that includes an error detection element
100 can be described. The delay estimator 28 of FIG.
1 includes an optional error detection unit 100 that is in
electrical circuit with the weighting element 32. The error
detection unit 100 can generate an error signal 102 that represents
the accuracy of the delay signal 60 generated by the phase 
difference estimator 28. In one preferred embodiment of the 
invention, the weighting element 32 can affect the weighting 
of the aligned output signal 64 responsive to the error signal 
102. The weighting element 32 can include a user-selected 
error parameter. The weighting element 32 can compare the 
generated error signal 102 with the user-selected error 
parameter and generate a weighting parameter for the asso- 
ciated output signal 64 as a function of the error signal 102 
and the user-selected error parameter. 

In one preferred embodiment of the error detection unit 
100, the detection unit 100 includes a data processor that
generates the error signal 102 as a function of the phase 
angle difference signal 62 and the magnitude components of 
the frequency dependent signal. In one example the error 
signal 102 is computed from: 

$$\mathrm{Error}(\tau) = \sum_{k} |R(k)|^{2} \left( \omega(k)\,\tau - d(k) \right)^{2}$$

The error signal 102 can provide a useful means for 
evaluating the significance of a delay signal 60. A relatively 
large error signal 102 can indicate that the predicted delay 
signal 60 is inaccurate, as would be expected during times 
when there are no input signals in-coming to the sensor array 
12. A small value can demonstrate that the delay signal 60
is a good measure of the relative time delay between the
sensors 16.

In one embodiment, a normalized version of this error
signal 102 can be calculated and compared to a user-selected
parameter that represents an environmentally dependent
threshold to determine if the delay signal 60 is valid.
Environmentally dependent factors can include background
noise, deviations between sensor performance and other
similar factors.
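
A validity check of this kind might be sketched as follows. This is an illustrative sketch under stated assumptions, not the formula of the specification. It assumes the error is a sum of squared phase residuals about the fitted line, weighted by squared magnitude, and that normalization divides by the sum of the weights. The function names and the threshold value are hypothetical.

```python
def weighted_phase_error(phase_diffs, magnitudes, slope):
    """Sum over bins k of |R(k)|^2 * (slope * k - d(k))^2 (assumed form)."""
    return sum((m ** 2) * (slope * k - d) ** 2
               for k, (d, m) in enumerate(zip(phase_diffs, magnitudes)))


def normalized_error(phase_diffs, magnitudes, slope):
    """Error divided by the total weight (assumed normalization).
    With no signal energy at all, the estimate is treated as unusable."""
    total_weight = sum(m ** 2 for m in magnitudes)
    if total_weight == 0.0:
        return float("inf")
    return weighted_phase_error(phase_diffs, magnitudes, slope) / total_weight


def delay_is_valid(phase_diffs, magnitudes, slope, threshold):
    """Compare the normalized error with a user-selected threshold."""
    return normalized_error(phase_diffs, magnitudes, slope) < threshold


if __name__ == "__main__":
    slope = 0.03
    clean = [slope * k for k in range(50)]   # perfectly linear phase
    mags = [1.0] * 50
    print(delay_is_valid(clean, mags, slope, threshold=0.1))
```

Dividing by the total weight makes the threshold independent of overall signal level, which is one plausible reading of "normalized" here.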

In another preferred embodiment, the error detection unit
100 generates a signal that represents the geometric mean of
the individual magnitudes of the frequency-dependent signals,
|R(k)| = √(|R_i(k)| |R_j(k)|), and uses this mean to compute
the error signal 102. This preferred embodiment is understood
to be more resistant to noise and gain differences
between the sensors 16.

In a further embodiment of the present invention, depicted
in FIG. 5, a beamforming apparatus 98 according to the
invention can be constructed having an orthogonal array 90
of sensor elements 16. The beamforming apparatus 98
according to this embodiment of the invention determines
the position of target source 38 through a series of triangulation
calculations which require knowledge of the signal's
relative delay when projecting onto a pair of microphone
receivers.

The beamforming apparatus 98 can include the orthogonal
array 90 and a signal processor 114. The orthogonal
array 90 can include a plurality of sensor elements 16 each
connected to an input channel that includes a sampling unit
18, a window filter 20 and a time-to-frequency transform
element 22. The signal processor 114 can include a reference
channel 24 and plural alignment channels 26. Each alignment
channel 26 includes a phase difference estimator 28, a
phase alignment element 30 and an optional weighting
element 32. The signal processor 114 can further include a
source locator unit 116, in electrical circuit with each of the
phase difference estimators 28, a summation element 34 in
electrical circuit with each of the phase alignment elements
30 and a frequency-to-time transform element 36 in electrical
circuit with the summation element 34. As will be
explained in greater detail hereinafter, the source locator unit
116 generates an output signal 120 that represents the
location of the detected source, e.g., source 38, relative to
the sensor array 90.

FIG. 6 illustrates the orthogonal array 90 that includes
sensor elements 16 distributed in two independent arrays
including a horizontal array 94 and a vertical array 92. An
orthogonal array is preferred for its stability in evaluating
both the x and y positions, although other transverse array
configurations can be practiced with the present invention.
Further, it should be apparent to one of ordinary skill in the
art of signal processing that the array 90 can include a third
array of sensors 16 disposed above or below the plane
formed by the orthogonally arranged arrays 92 and 94. The
third array can be configured into the system in the manner of
arrays 92 and 94 and can yield time delay information
related to a third dimension, or coordinate, of the source 38,
for example height.

While either linear array 92 or 94 may be used to evaluate
both the x and y coordinates of the source position, the
triangulation procedure is understood to be most effective if
position coordinates are determined by the array in the
direction normal to the source. For example, using only the
sensors 16 in the array 94 is effective for evaluating the
x-coordinate of the source location 38, but not as accurate at
finding the y-coordinate. By combining both axes in the
triangulation procedure, the estimate is equally sensitive in
either direction.

Each sensor 16 detects signals, including signals generated
from the target source 38, and generates an electrical
response signal that includes a component that represents the
signal generated from the signal source 38. The sensors 16 of the
sensor array 90 can be microphones, antennas, sonar phones
or any other sensor capable of detecting a propagating signal
and generating an electrical response signal that represents
the detected signal.

The source locator 116 can generate the position signal
120 that represents the position of the source 38 relative to
the sensor array 90. In one preferred embodiment of the
source locator 116, at least four phase difference estimators
28 transmit delay signals 60 to the source locator 116.
Preferably the delay signals 60 transmitted to the source
locator 116 represent the time delay between two spatially
adjacent sensors 16 in array 94 and two spatially adjacent
sensors 16 in array 92. With reference to FIG. 6, the
generation of position signal 120 can be explained. Given
four sensors 16, one pair on the x-axis array 92 at positions
x1 and x2 and another pair on the y-axis array 94 at y1 and
y2, the curves Px and Py represent the loci of points px ∈ Px
and py ∈ Py such that:

where δx and δy are constants such that |δx| ≤ |x2 − x1| and
|δy| ≤ |y2 − y1|. The curve Px can be interpreted as the set of
locations which produce the same relative delay between x1
and x2. This relative delay, represented by the delay signal
60 (in samples), can be related to δx by the following
relation:




where f_s is the sampling rate of the sampling elements 18.
Py and δy may be regarded similarly with respect to the
sensors 16 on the y-axis array 94.
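
The relation between a delay in samples and the corresponding path-length difference can be illustrated with a short sketch. The relation itself is not legible in this copy of the specification, so the sketch assumes the usual propagation model, in which the path-length difference equals the wave speed times the delay in seconds. The speed of sound (343 m/s in air), the function names, and the example numbers are all assumptions.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value for air at roughly 20 degrees C


def delay_samples_to_path_difference(delay_samples, sample_rate_hz,
                                     wave_speed=SPEED_OF_SOUND_M_PER_S):
    """Path-length difference in meters for a relative delay in samples,
    assuming difference = wave_speed * delay_samples / sample_rate."""
    return wave_speed * delay_samples / sample_rate_hz


def is_physically_consistent(path_difference_m, sensor_spacing_m):
    """A path-length difference cannot exceed the spacing of the sensor pair."""
    return abs(path_difference_m) <= sensor_spacing_m


if __name__ == "__main__":
    delta = delay_samples_to_path_difference(2.0, 20000.0)
    print(delta, is_physically_consistent(delta, 0.05))
```

The consistency check mirrors the bound on δx and δy stated above: an estimate that violates it cannot correspond to any source location.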

The intersection of Px and Py represents a unique source
location that produces relative delay signals 60, τx and τy,
between the respective sensor 16 pairs. The source locator
unit 116 can generate the position signal 120 by estimating
the relative delays at each sensor pair, generating the
curves Px and Py, and finding their intersection. Given that Px
and Py each represent one half of a hyperbola, the intersection
of Px and Py may be solved for algebraically. The simultaneous
solution of the hyperbola equations reduces to finding
the roots of a fourth order polynomial. From these four roots,
the real root which corresponds to the actual coordinate pair
(x,y) of the source location can be identified. This can be
accomplished by noting that the four intersection points of
these two hyperbolas are each located in a distinct quadrant
of the x-y plane. These four quadrants are demarcated by the
lines y=(y1+y2)/2 and x=(x1+x2)/2. The proper quadrant
may be chosen directly from the signs of the δx and δy terms.

In one preferred embodiment of the source locator 116,
the locator 116 can select which sensor pairs and delay
signals 60 to use to generate the position signal 120. For
eight sensors 16 there are 28 subsets of two, which corresponds
to 28² = 784 combinations of the x-y axes sensor
pairs. The first restriction imposed is to consider only pairs
of sensors 16 that are spatially contiguous. The second
constraint is to consider only those delay signals 60 with an
associated normalized error less than a certain threshold.
The error signal 102 of each error unit 100 can be transmitted
by a conducting element to the source locator 116.
The source locator can compare the error signal 102 against
a user-selected error parameter. If the comparison indicates
a large error, then either the delay signal 60 is
inaccurate, the single source model does not apply, or
this is a region of silence. In the first two cases the position
signal 120 generated by the source locator 116 is a low
quality estimate of the position. In the final case the position
signal is meaningless as a position signal 120 but does
indicate the presence of a signal source 38.

In the preferred embodiment, the source locator 116
connects to each delay estimator 28 and, for each array 92
and 94, collects the delay signals 60 and corresponding error
signal 102 for each set of sensor pairs with less than a
user-selected error-threshold. The source locator 116 orders
each set by increasing normalized error as represented by the
error signal 102. If either set is empty then no position signal
120 is generated. If either set has sensor pairs with error
signals 102 below the user-selected error parameter, then the
source locator 116 generates a position signal for a user-selected
number of sensor pairs. The position signal 120 can
be generated as the mean of several position estimates.
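
The pair-selection procedure described above can be sketched in Python as follows. This is a hedged illustration, not the source locator's actual logic. The tuple layout `(pair_id, delay_samples, normalized_error)`, the function names, and the `solve_position` callback (a stand-in for the hyperbola intersection step) are hypothetical.

```python
from math import comb


def count_pair_combinations(n_sensors):
    """Number of (x-pair, y-pair) combinations when every two-sensor
    subset is allowed on each axis, e.g. 28 * 28 = 784 for eight sensors."""
    return comb(n_sensors, 2) ** 2


def select_pairs(candidates, error_threshold, max_pairs):
    """Keep candidates whose normalized error is below the threshold,
    order them by increasing error, and return at most max_pairs."""
    kept = [c for c in candidates if c[2] < error_threshold]
    kept.sort(key=lambda c: c[2])
    return kept[:max_pairs]


def mean_position(estimates):
    """Arithmetic mean of a non-empty list of (x, y) estimates."""
    n = len(estimates)
    return (sum(x for x, _ in estimates) / n,
            sum(y for _, y in estimates) / n)


def locate(x_candidates, y_candidates, error_threshold, max_pairs,
           solve_position):
    """Return the mean position over the selected pairs, or None when
    either axis has no pair that passes the error threshold."""
    x_sel = select_pairs(x_candidates, error_threshold, max_pairs)
    y_sel = select_pairs(y_candidates, error_threshold, max_pairs)
    if not x_sel or not y_sel:
        return None
    estimates = [solve_position(xc, yc) for xc in x_sel for yc in y_sel]
    return mean_position(estimates)


if __name__ == "__main__":
    print(count_pair_combinations(8))
```

Returning `None` when either set is empty corresponds to the case in which no position signal is generated.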

The source locator unit 116 can be a conventional elec- 
trical circuit card that includes arithmetic and logic circuits 
for generating from delay signals 60 of the phase difference 
estimators 28, a position signal that represents the position 
of the source 38 relative to the sensor array 90. The source 
locator unit 116 can also be a conventional data processor, 
such as an engineering workstation of the type sold by the
SUN Corporation, having an application program for gen- 
erating from the delay signals 60 of the phase difference 
estimators 28, a position signal that represents the position 
of the source 38 relative to the sensor array 90. 

Described above are improved methods and apparatus for 
combining a plurality of signals to generate a beam signal
for enhancing the reception of signals at a select position 
relative to an array of sensor elements. The invention has 
been described with reference to preferred, but optional, 
embodiments of the invention that achieve the objects of the 
invention set forth above. 

Thus, for example, a steerable array of microphones has 
been described that has the potential to replace the tradi- 
tional microphone as the input transducer system of speech 
data. An array of microphones has a number of advantages 
over a single-microphone system. It may be electronically 
aimed to provide a high-quality signal from a desired source 
location while it simultaneously attenuates interfering talk- 
ers and ambient noise. In this regard, an array has the ability 
to outperform a single, highly-directional microphone. An 
array system does not necessitate local placement of trans- 
ducers, will not encumber the talker with a hand-held or 
head-mounted microphone, and does not require physical 
movement to alter its direction of reception. These features 
make it advantageous in settings involving multiple or 
moving sources. Furthermore, it is capable of activities that 
a single microphone cannot perform, namely the automatic 
detection, location, and tracking of active talkers in its 
reception region. Existing array systems have been used in 
a number of applications. These include teleconferencing, 
speech recognition, speech acquisition in an automobile
environment, large-room recording-conferencing, and hearing
aid devices. These systems also have the potential to be
beneficial in several other environments, the performing
arts and sporting communities, for instance.

The above described embodiments have been set forth to
describe more completely and concretely the present invention,
and are not to be construed as limiting the invention.
Thus, for example, the invention can be practiced as a radar
system having a two-dimensional array of antenna elements
disposed at non-uniform spacing in a plane. The array can
couple to a signal processor constructed according to the
present invention that can align each of the signals received
by the antenna relative to each other. Additionally, the radar
system can include a source locator unit that determines
from the relative time delays between the antennas the
position of the source relative to the antenna array.

It is further intended that all matter and the description 
and drawings be interpreted as illustrative and not in a 
limiting sense. That is, while various embodiments of the 
invention have been described in detail, other alterations 
which will be apparent to those skilled in the art are intended 
to be embraced within the spirit and scope of the invention. 

In view of the foregoing, what is claimed is:

1. Signal processing apparatus for combining a plurality 
of frequency-dependent signals wherein each frequency- 
dependent signal has a magnitude component and a phase 
angle component, said apparatus comprising 

reference means for defining one of said frequency- 
dependent signals as a reference signal having a user- 
selected phase angle, 
a plurality of alignment means, each coupled to a respec- 
tive one of said frequency-dependent signals, for 
adjusting the phase angles of said signals relative to 
said reference signal, said alignment means having 
phase difference estimator means for generating a delay 
signal representative of a time delay between said 
reference signal and said frequency-dependent sig- 
nal, and 

phase alignment means for generating, as a function of
said delay signal, an output signal having a magnitude
component representative of the magnitude
component of said frequency-dependent signal and
having a phase angle component adjusted to a select
phase relationship with said reference signal, and

summation means, coupled to said plurality of align- 
ment means for summing together said phase aligned 
output signals to generate a beam signal. 

2. Apparatus according to claim 1 further comprising 
means for generating said plurality of frequency-depen- 
dent signals, said means including 

an array of spatially distributed sensor elements, 
wherein each sensor element includes means for 
detecting a signal and generating a respective one of 
said plural frequency-dependent signals to represent 
said signal detected at said spatially distributed sensor
element.

3. Apparatus according to claim 2 wherein 

said array includes a linear array of spatially distributed 
sensor elements. 

4. Apparatus according to claim 2 wherein 

said array includes a two-dimensional array of spatially 
distributed sensor elements. 

5. Apparatus according to claim 2 wherein 

said array includes a three-dimensional array of spatially 
distributed sensor elements. 

6. Apparatus according to claim 1 wherein said phase 
difference estimator means includes 

means for generating said delay signal as a function of 
said reference signal and said respective one of said 
frequency-dependent signal. 

7. Apparatus according to claim 1 wherein said phase 
difference estimator means couples to a delay signal of a 
second alignment means and includes 

summing means for summing said delay signals to generate
a signal representative of the time delay between
said respective one of said frequency-dependent signal
and said reference signal.

8. A signal processing apparatus according to claim 1 
further comprising 

weighting means, connected to one or more phase align- 
ment means, for increasing or decreasing the magni- 
tude component of each of said output signals. 

9. A signal processing apparatus according to claim 1 
further comprising 

weighted averaging means, connected to at least a portion 
of said phase alignment means, for increasing or 
decreasing the magnitude component of said output 
signals as a function of a normalizing factor representative
of the number of output signals summed together.

10. Signal processing apparatus for combining a plurality 
of frequency-dependent signals wherein each frequency-dependent
signal has a magnitude component and a frequency
component, said apparatus comprising

reference means for defining one of said frequency- 
dependent signals as a reference signal having a user- 
selected phase angle, 

a plurality of alignment means, each coupled to a respective
one of said frequency-dependent signals, for
adjusting the phase angles of said frequency-dependent 
signals relative to said reference signal, said alignment 
means having 

storage means for storing a magnitude component and
a phase angle component of said frequency-depen- 
dent signal, 

delay estimator means for generating, as a function of 
the difference in phase angles of two frequency-dependent
signals, a delay signal representative of a
time delay between said reference signal and said 
frequency-dependent signal, and 

phase alignment means for generating as a function of 
said delay signal, an output signal having a magni- 
tude component representative of the magnitude 
component of said frequency-dependent signal and
having a phase angle adjusted to a select phase 
relationship with said reference signal, and 

summation means, coupled to said plurality of alignment
means and having means for summing frequency-dependent
signals, for generating a beam
signal representative of a summation of said output 
signals. 

11. A signal processing apparatus according to claim 10 
wherein said delay estimator includes weighting means for
generating as a function of said magnitude components of
said frequency-dependent signal, said difference in phase 
angles. 

12. A signal processing apparatus according to claim 10 
further including 

error detection means for generating, as a function of said
delay signal and said phase angle component of said 
frequency-dependent signal, an error signal represen- 
tative of the accuracy of said delay signal. 

13. A signal processing apparatus according to claim 12 
wherein said summation means includes means for moni- 
toring said error signal to adjust said beam signal responsive 
to an error signal larger than a user-selected error-parameter. 

14. A signal processing apparatus according to claim 12 
further comprising 

means for generating said error signal as a function of the 
geometric mean of the magnitude components of two 
frequency-dependent signals. 

15. A beamforming apparatus for combining a plurality of 
frequency-dependent signals wherein each frequency-dependent
signal has a magnitude component and a phase
angle component comprising 

means for generating said plurality of frequency-depen- 
dent signals, having an array of spatially distributed 
sensor elements, wherein each sensor element includes
transducer means for detecting a signal and for generating
a respective one of said plural signals to represent
said signal detected at said spatially distributed sensor 
element, 

reference means for storing one of said frequency-dependent
signals as a reference signal having a user-selected
phase angle,
a plurality of alignment means, each coupled to a respective
one of said frequency-dependent signals, for
adjusting the phase angle components of said fre- 
quency-dependent signals relative to said reference 
signal, said alignment means having 
storage means for storing said magnitude component 
and said phase angle component of said frequency- 
dependent signal, 
delay estimator means for generating, as a function of 
the difference in phase angles of two frequency- 
dependent signals, a delay signal representative of a 
time delay between said reference signal and said 
frequency-dependent signal, and 
phase alignment means for generating as a function of 
said delay signal, an output signal having a magni- 
tude component representative of the magnitude 
component of said frequency-dependent signal and 
having a phase angle component adjusted to a select 
phase relationship with said reference signal, and 
summation means, coupled to said plurality of align- 
ment means and having means for summing fre- 
quency-dependent signals, for generating a beam 
signal representative of a combination of said output 
signals. 

16. Apparatus according to claim 15 wherein 

said array includes a linear array of spatially distributed 
sensor elements and said detection means includes 
means for detecting audio signals. 

17. Apparatus according to claim 15 wherein 

said array includes a linear array of spatially distributed 
microphones of the type amenable for detecting audio 
signals. 

18. Apparatus according to claim 15 wherein 

said array includes digital conversion means, coupled to 
each of said sensor elements, for generating said 
respective signal as digital electrical signal. 

19. Apparatus according to claim 18 wherein

said array includes window filter means, coupled to each 
of said sensor elements, for generating said respective 
signal to represent a discrete portion of said digital 
electrical signal. 

20. Apparatus according to claim 18 wherein 

said array includes a 512 point Hanning window filter
means, coupled to each of said sensor elements, for 
generating said respective signal to represent a 512 
point portion of said digital electrical signal. 

21. Apparatus according to claim 15 wherein said array 
further comprises 

time-to-frequency transform means, coupled to each of 
said sensor elements, for generating said respective 
signal as a frequency-dependent representation of said 
detected signal. 

22. Apparatus according to claim 21 wherein said fre- 
quency transform means includes 

fast Fourier transform means for generating a plurality of
Fourier coefficients representative of at least a portion
of the spectral content of said detected signal. 

23. Apparatus according to claim 15 wherein said delay 
estimator further comprises 

spatial aliasing filter means for generating said delay 
signal as a function of the spatial distribution of said 
sensor elements. 

24. Apparatus according to claim 15 wherein said summation
means further comprises

frequency-to-time transform means, coupled to said sig- 
nal summation means, for generating said beam signal 
as a time-dependent signal. 
25. Apparatus according to claim 15 wherein
said array of spatially distributed sensor elements has a 
first array of sensor elements spatially distributed rela- 
tive to a first axis and a second array of sensor elements 
spatially distributed relative to a second axis extending
transversely to said first axis, 
said reference means has means for storing a first refer- 
ence signal and a second reference signal representative 
of frequency magnitudes and phase angles of one of 
said frequency-dependent signals generated by said
first array and said second array respectively, and 




said delay estimator means has means for generating a
first delay signal and a second delay signal represen- 
tative of the time delay between said first reference 
signal and a frequency-dependent signal generated by 
said first array and said second reference signal and a 
frequency-dependent signal generated by said second 
array, and means for generating a position signal, as a 
function of said first delay signal and said second delay 
signal, representative of the position of said detected
signal relative to said first and second arrays. 

* * * * *



SOUNDINGS

CODEN: JASMAN
ISSN: 0001-4966

The Journal of the Acoustical Society of America

Vol. 110, No. 6
December 2001

ACOUSTICAL NEWS—USA  2811
USA Meetings Calendar  2811
ACOUSTICAL NEWS—INTERNATIONAL  2813
International Meetings Calendar  2813
BOOK REVIEWS  2815
REVIEWS OF ACOUSTICAL PATENTS  2819



LETTERS TO THE EDITOR

Transverse, normal modes of vibration of a cantilever Timoshenko
beam with a mass elastically mounted at the free end [40]
    C. A. Rossit, P. A. A. Laura  2837

Acoustical design of Benaroya Hall, Seattle [55]
    Cyril M. Harris  2841

Comparison of voice F0 responses to pitch-shift onset and offset
conditions [70]
    Charles R. Larson, Theresa A. Burnett, Jay J. Bauer, Swathi
    Kiran, Timothy C. Hain  2845

GENERAL LINEAR ACOUSTICS [20]

Generation of very high pressure pulses with 1-bit time reversal in
a solid waveguide
    Gabriel Montaldo, Phillippe Roux, Arnaud Derode, Carlos Negreira,
    Mathias Fink  2849

Scattering from a ribbed finite cylindrical shell
    Michel Tran-Van-Nhieu  2858

On the complex conjugate roots of the Rayleigh equation: The
leaky surface wave
    Christoph T. Schroder, Waymond R. Scott, Jr.  2867

Small-slope scattering from rough elastic ocean floors: General
theory and computational algorithm
    Robert F. Gragg, Daniel Wurmser, Roger C. Gauss  2878

Effect of circumferential edge constraint on the acoustical
properties of glass fiber materials
    Bryan H. Song, J. Stuart Bolton, Yeon June Kang  2902

UNDERWATER SOUND [30]

Interpretation of the spectra of energy scattered by dispersed
anchovies
    Orest Diachok  2917

Extinction theorem for object scattering in a stratified medium
    Purnima Ratilal, Nicholas C. Makris  2924

On the relative role of sea-surface roughness and bubble plumes in
shallow-water propagation in the low-kilohertz region
    Guy V. Norton, Jorge C. Novarini  2946



(Continued) 



SOUNDINGS

THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA VOL. 110, NO. 6, DECEMBER 2001

CONTENTS—Continued from preceding page



Interface scattering by poroelastic seafloors: First-order theory
    Kevin L. Williams, James M. Grochocinski, Darrell R. Jackson  2956



ULTRASONICS, QUANTUM ACOUSTICS, AND PHYSICAL EFFECTS OF SOUND [35]

The propagation of ultrasound within a gas jet
    D. A. Hutchins, C. S. McIntyre, D. W. Choi, D. R. Billson,
    T. J. Robertson  2964

Acoustic attenuation in a three-gas mixture: Results
    Yefim Dain, Richard M. Lueptow  2974



TRANSDUCTION [38]

Measurement of electrostrictive coefficients of polymer films
    Francois M. Guillot, Jacek Jarzynski, Edward Balizer  2980



STRUCTURAL ACOUSTICS AND VIBRATION [40]

Transient flexural waves in a disk and square plate from off-center
impact
    Michael El-Raheb, Paul Wagner  2991

Interpretation and identification of minimum phase reflection
coefficients
    J. Gregory McDaniel, Cory L. Clarke  3003

On the emergence of the Green's function in the correlations of a
diffuse field
    Oleg I. Lobkis, Richard L. Weaver  3011

An approximate analytic solution for the radiation from a
line-driven fluid-loaded plate
    Daniel T. DiPerna, David Feit  3018

Analysis and measurement of a matched volume velocity sensor
and uniform force actuator for active structural acoustic control
    P. Gardonio, Y.-S. Lee, S. J. Elliott, S. Debost  3025

Polynomial relations for quasi-static mechanical characterization of
isotropic poroelastic materials
    Christian Langlois, Raymond Panneton, Noureddine Atalla  3032



NOISE: ITS EFFECTS AND CONTROL [50]

Active control of the volume acquisition noise in functional
magnetic resonance imaging: Method and psychoacoustical
evaluation
    John Chambers, Michael A. Akeroyd, A. Quentin Summerfield,
    Alan R. Palmer  3041



ARCHITECTURAL ACOUSTICS [55]

A power conservation approach to predict the spatial variation of
the cross-sectionally averaged mean-square pressure in reverberant
enclosures
    Linda P. Franzoni  3055

A profiled structure with improved low frequency absorption
    Tao Wu, Trevor J. Cox, Y. W. Lam  3064

An acoustic boundary element method based on energy and
intensity variables for prediction of high-frequency broadband
sound fields
    Linda P. Franzoni, Donald B. Bliss, Jerry W. Rouse  3071

The Monte Carlo method to determine the error in calculation of
objective acoustic parameters within the ray-tracing technique
    Javier Giner, Carmelo Militello, Amando Garcia  3081

On the sound insulation of wood stud exterior walls
    J. S. Bradley, J. A. Birta  3086



PHYSIOLOGICAL ACOUSTICS [64]

Origin of the bell-like dependence of the DPOAE amplitude on
primary frequency ratio
    Andrei N. Lukashkin, Ian J. Russell  3097

A human nonlinear cochlear filterbank
    Enrique A. Lopez-Poveda, Ray Meddis  3107






Distortion product otoacoustic emission input/output functions in
normal-hearing and hearing-impaired human ears
    Patricia A. Dorn, Dawn Konrad-Martin, Stephen T. Neely,
    Douglas H. Keefe, Emily Cyr, Michael P. Gorga  3119

Effects of draining cochlear fluids on stapes displacement in human
middle-ear models
    Richard M. Lord, Eric W. Abel, Zhigang Wang, Robert P. Mills  3132

Multicomponent stimulus interactions observed in basilar-membrane
vibration in the basal region of the chinchilla cochlea
    William S. Rhode, Alberto Recio  3140

DPOAE suppression tuning: Cochlear immaturity in premature
neonates or auditory aging in normal-hearing adults?
    Carolina Abdala  3155

Energy-independent factors influencing noise-induced hearing loss
in the chinchilla model
    Roger P. Hamernik, Wei Qiu  3163



PSYCHOLOGICAL ACOUSTICS [66] 
Towards a measure of auditory-filter phase response
    Andrew J. Oxenham, Torsten Dau    3169



SPEECH PRODUCTION [70] 

Spatio-temporal analysis of irregular vocal fold oscillations:
Biphonation due to desynchronization of spatial modes
    Jürgen Neubauer, Patrick Mergell, Ulrich Eysholdt,
    Hanspeter Herzel    3179

A method of applying Fourier analysis to high-speed laryngoscopy
    Svante Granqvist, Per-Åke Lindestad    3193

Effects of ethanol intoxication on speech suprasegmentals
    Harry Hollien, Gea DeJong, Camilo A. Martin, Reva Schwartz,
    Kristen Liljegren    3198

Surrogate analysis for detecting nonlinear dynamics in normal
vowels
    Isao Tokuda, Takaya Miyano, Kazuyuki Aihara    3207



SPEECH PROCESSING AND COMMUNICATION SYSTEMS [72] 

A two-microphone dual delay-line approach for extraction of a
speech sound in the presence of multiple interferers
    Chen Liu, Bruce C. Wheeler, William D. O'Brien, Jr.,
    Charissa R. Lansing, Robert C. Bilger, Douglas L. Jones,
    Albert S. Feng    3218

Improvements in intelligibility of noisy reverberant speech using a
binaural subband adaptive noise-cancellation processing scheme
    Paul W. Shields, Douglas R. Campbell    3232



BIOACOUSTICS [80] 

Ultrasonic properties of random media under uniaxial loading
    M. F. Insana, T. J. Hall, P. Chaturvedi, Ch. Kargel    3243

A point process approach to assess the frequency dependence of
ultrasound backscattering by aggregating red blood cells
    David Savery, Guy Cloutier    3252

Differential degradation of antbird songs in a Neotropical
rainforest: Adaptation to perch height?
    Erwin Nemeth, Hans Winkler, Torben Dabelsteen    3263

Fundamental precision limitations for measurements of frequency
dependence of backscatter: Applications in tissue-mimicking
phantoms and trabecular bone
    Keith A. Wear    3275

Suppression of large intraluminal bubble expansion in shock wave
lithotripsy without compromising stone comminution: Methodology
and in vitro experiments
    Pei Zhong, Yufeng Zhou    3283

Auditory display of knee-joint vibration signals
    Sridhar Krishnan, Rangaraj M. Rangayyan, G. Douglas Bell,
    Cyril B. Frank    3292






Three-dimensional modeling of hearing in Delphinus delphis
    James L. Aroyan    3305

Numerical analysis of ultrasonic transmission and absorption of
oblique plane waves through the human skull
    Mark Hayner, Kullervo Hynynen    3319

INDEX TO VOLUME 110 

How To Use This index 3331 

Classification of Subjects 3331 

Subject Index To Volume 110 3336 

Author Index To Volume 110 3378 



Document Delivery: Copies of journal articles can be ordered from 
DocumentStore, our online document delivery service (URL: 
http://documentstore.org/). 



A two-microphone dual delay-line approach for extraction
of a speech sound in the presence of multiple interferers

Chen Liu, Bruce C. Wheeler, William D. O'Brien, Jr., Charissa R. Lansing,
Robert C. Bilger, Douglas L. Jones, and Albert S. Feng

Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, 
Urbana, Illinois 61801 

(Received 28 June 2000; revised 14 June 2001; accepted 19 September 2001) 

This paper describes algorithms for signal extraction for use as a front-end of telecommunication 
devices, speech recognition systems, as well as hearing aids that operate in noisy environments. The 
development was based on some independent, hypothesized theories of the computational 
mechanics of biological systems in which directional hearing is enabled mainly by binaural 
processing of interaural directional cues. Our system uses two microphones as input devices and a 
signal processing method based on the two input channels. The signal processing procedure 
comprises two major stages: (i) source localization, and (ii) cancellation of noise sources based on 
knowledge of the locations of all sound sources. The source localization, detailed in our previous 
paper [Liu et al., J. Acoust. Soc. Am. 108, 1888 (2000)], was based on a well-recognized biological
architecture comprising a dual delay-line and a coincidence detection mechanism. This paper 
focuses on description of the noise cancellation stage. We designed a simple subtraction method 
which, when strategically employed over the dual delay-line structure in the broadband manner, can 
effectively cancel multiple interfering sound sources and consequently enhance the desired signal. 
We obtained an 8-10 dB enhancement for the desired speech in the situations of four talkers in the 
anechoic acoustic test (or 7-10 dB enhancement in the situations of six talkers in the computer 
simulation) when all the sounds were equally intense and temporally aligned. © 2001 Acoustical
Society of America. [DOI: 10.1121/1.1419090]

PACS numbers: 43.72.Ar, 43.72.Dv, 43.60.Bf [DOS] 



I. INTRODUCTION

Selective hearing is a useful mechanism for extracting 
desired signals in complex acoustic environments such as a 
cocktail party. This so-called "cocktail party" effect has been 
studied psychophysically (Cherry, 1953; Blauert, 1983;
Bregman, 1990; Bronkhorst and Plomp, 1992). The ability to 
hear in complex acoustic environments is largely attributed 
to the capacity to discern the spatial origins of sound sources. 
The neural circuitry and the underlying mechanisms for 
sound localization are fairly well established (Konishi et al.,
1988; Takahashi and Keller, 1994; Yin and Chan, 1990). 
Sound localization involves binaural processing of minute 
differences in time, intensity, and spectrum between the two 
ears. However, although we know the capacity of the audi- 
tory system to selectively attend to sounds originating from 
one source and suppress the other sounds in the ambiance, 
the underlying mechanisms for doing so are largely un- 
known. Therefore, designing an artificially intelligent system 
today to achieve selective hearing is still largely based on our 
relatively rich knowledge of the physical world (e.g., signal 
processing techniques) plus our limited knowledge of the 
biological world. 

One of the prominent noise suppression concepts is the
EC or equalization-and-cancellation scheme of Durlach
(Durlach, 1960, 1972). It requires two inputs followed by a
two-stage signal processing: (i) equalization that makes the
noise components identical in both channels; and (ii) cancellation
or subtraction of the noise components in one channel
from those in the other channel. Actually most
two-microphone-based noise cancellation techniques to date (e.g.,
Widrow et al., 1975; Strube, 1981; Chabries et al., 1982;
Chazan et al., 1988; Weiss, 1987; Peterson et al., 1987) are
essentially variants of the EC scheme and differ primarily in
the procedures by which the filter parameters are adapted.
Thus far, these have rendered satisfactory noise reduction
only for situations in which there are one desired source and
one noise source.

Our noise cancellation technique described herein also
falls in this category. However, it is devised so as to cancel
multiple noise sources more efficiently by capitalizing on the
knowledge of the spatial directions of the sound sources in
the environment. For the purpose of sound localization, we
have designed a system (Liu et al., 2000) based on a broadband
"dual delay-line" structure and the coincidence detection
principle of Jeffress (1948). Our noise cancellation technique
also adopts the dual delay-line as the infrastructure.

So far the Jeffress model has been studied and various
modifications have been developed to account for different
psychological observations (see reviews in the book chapters
by Colburn and Durlach, 1978; Colburn, 1996; and Stern and
Trahiotis, 1995, 1997). It was only recently that the Jeffress
model began to be considered for use in the extraction of



3218 J. Acoust. Soc. Am. 110 (6), December 2001 0001-4966/2001/110(6)/3218/14/$18.00 © 2001 Acoustical Society of America



[Figure 1: block diagram. Recoverable block labels: A/D conversion of the left and right inputs; dual delay-line "subtraction" (Eq. 11); integration over frequency and integration over time (Eq. 16); noise source localization (Eq. 22); pick up $X_n^{(i_{\mathrm{noise}})}(m)$ for output, i.e., equivalent to Eq. (24); frequency synthesis.]

FIG. 1. The block diagram of System I for extraction of the desired source, whose location is known a priori, in the presence of one noise source whose
location is estimated by the system.



speech in noise (e.g., Bodden, 1993; Banks, 1993). Since the
model maps the acoustic space into a network, one appealing
feature is the potential for detecting the number and the azimuths
of sound sources present in the auditory space. This
provides the mechanism by which the system can focus on,
and extract the signal from, one desired source direction,
while at the same time suppressing the sounds arising from
the other directions. In his acoustic processor, Bodden (1993,
1996) basically took Jeffress' coincidence sound localization
models, as implemented by Lindemann (1986) and Gaik
(1993), and added a time-variant Wiener filter for noise cancellation
after sounds had been localized. However, since it
is impossible to obtain an accurate estimate of the power
density spectra of both the desired and noise signals, the
result will always have residual noise and some cancellation
and distortion of the desired signal.

The work described herein was motivated by the need to 
find a general solution for signal extraction in real world 
situations where there are multiple (>2) concurrent sound 
sources. Our signal extraction technique evolved from a sub- 
traction procedure. Note that, interestingly, subtraction is 
also employed in the directional hearing mechanism with a 
pressure-gradient receiver (Feng and Shofner, 1981). Theo- 
retically, a conventional noise cancellation system using a 
two-microphone array performs well when there are two 
sources but its performance degrades rapidly as the number 
of sources increases. To attack this problem, we developed a 
broadband noise cancellation strategy, making the two- 
microphone array subtraction approach more effective by 
taking advantage of the dual delay-line structure. 

In this paper, we first introduce a subtraction method,
which is the core of our noise cancellation technique. The
subtraction procedure is then extended via the broadband
dual delay-line structure for cancellation of multiple sources.
In Sec. II A, we describe the subtraction procedure in the
context of extracting a desired source at a known location in
the presence of one interfering source at an arbitrary location.
The subtraction operation is mathematically analyzed in
Sec. II B. Section II C gives a beamforming interpretation of
the subtraction method. In Sec. II D, the method is generalized
to situations in which neither the location of the desired
source nor that of the interference is known. Section III describes
a strategy for extending the method to a system suitable
for cancellation of multiple interfering sources. Section
IV presents the experimental results and analysis. Discussion
of several practical issues is given in Sec. V.



II. INTRODUCTION TO THE NEW CANCELLATION 
SCHEME 

A. Cancellation algorithm based on the dual delay- 
line structure 

In this section we will describe a new noise cancellation
algorithm. It is fundamentally a subtraction operation applied
on the two input signals. The signals are received by two
microphones, which are paired with a fixed inter-microphone
distance. The subtraction is conducted based on the infrastructure
of the dual delay-line network in the frequency domain.
A block diagram of the basic signal processing system
(System I) is shown in Fig. 1. The two inputs, $x_{Ln}(t)$ and
$x_{Rn}(t)$, are digitized, their digital versions being $x_{Ln}(k)$ and
$x_{Rn}(k)$, respectively. Their spectra, $X_{Ln}(m)$ and $X_{Rn}(m)$,
$m = 1, \ldots, M$, are obtained through discrete Fourier transform
(DFT). The subscripts $L$ and $R$ denote left and right
channels, and $n$ the frame index of the short-term Fourier
analysis.

For clarity, we shall focus on the system description for
an arbitrary frequency $\omega_m$. For each frequency, the complex
signals from the two channels are fed into a pair of delay-lines
(Fig. 2), both of which are composed of $I$ delay units
with delay values $\tau_i$ ($i = 1, \ldots, I$) given by

$$\tau_i = \frac{\mathrm{ITD}_{\max}}{2} \sin\left(\frac{i-1}{I-1}\,\pi - \frac{\pi}{2}\right), \qquad i = 1, \ldots, I, \tag{1}$$

where $\mathrm{ITD}_{\max} = D/c$ is the maximum inter-microphone time
difference, $D$ is the inter-microphone distance, $c$ is the speed
of sound, and $I$ is an odd number greater than 1. By using
Eq. (1), it can be shown that the time delays are antisymmetric
with respect to the midpoint $(I+1)/2$, i.e.,

$$\tau_{I-i+1} = -\tau_i, \qquad i = 1, \ldots, (I+1)/2 - 1. \tag{2}$$

It is noted that if there is no diffracting object between the
two microphones, the horizontal plane can be uniformly divided
into $I$ sectors with the azimuth of each sector being

$$\theta_i = \frac{\pi}{2} - \frac{i-1}{I-1}\,\pi, \qquad i = 1, \ldots, I. \tag{3}$$

Therefore, the azimuths can be mapped one-to-one onto the
corresponding positions in the delay line

$$\tau_i = -\frac{\mathrm{ITD}_{\max}}{2} \sin\theta_i, \qquad i = 1, \ldots, I. \tag{4}$$

FIG. 2. The dual delay-line used as the basic structure in our system.



Note that the resolution of the dual delay-lines representing
the spatial azimuth is determined by the values of the time-delay
units $\tau_i$ that have a time unit such as millisecond. As
will be described in detail next, the delays are applied to the
left and right input signals at each frequency in the frequency
domain. Thus the dual delay-line works like rotating two
separate phasors in the opposite directions in the complex
plane [Eqs. (9) and (10)] until they are in phase, i.e., the
so-called coincidence operation. The step size of the rotation,
$\exp(-j\omega_m\tau_i)$, can be arbitrarily small in the frequency
domain. Therefore, the azimuthal resolution of the dual delay-lines
is not controlled by the sampling rate. Some other relevant
discussions will be given in Sec. II C.
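
The tap layout of the dual delay-line can be checked numerically. The following is a minimal Python sketch, assuming that Eq. (1) reads $\tau_i = (\mathrm{ITD}_{\max}/2)\sin[(i-1)\pi/(I-1) - \pi/2]$ and that Eq. (3) reads $\theta_i = \pi/2 - (i-1)\pi/(I-1)$. The function name `delay_taps`, the choice of 181 taps, and the speed of sound of 343 m/s are illustrative assumptions and are not taken from the paper; the 144 mm spacing is the value quoted in the caption of Fig. 3. Lists are 0-based, so list index k corresponds to tap i = k + 1.

```python
import math

def delay_taps(num_taps, mic_distance_m, speed_of_sound=343.0):
    """Return (tau, theta): tap delays in seconds and sector azimuths in radians.

    Hypothetical helper. num_taps plays the role of I and must be odd and > 1.
    """
    if num_taps <= 1 or num_taps % 2 == 0:
        raise ValueError("num_taps must be an odd number greater than 1")
    itd_max = mic_distance_m / speed_of_sound      # ITD_max = D / c
    tau, theta = [], []
    for i in range(1, num_taps + 1):
        frac = (i - 1) / (num_taps - 1)
        # Assumed form of Eq. (1): delay of the i-th unit
        tau.append(0.5 * itd_max * math.sin(frac * math.pi - math.pi / 2))
        # Assumed form of Eq. (3): azimuth of the i-th sector
        theta.append(math.pi / 2 - frac * math.pi)
    return tau, theta

tau, theta = delay_taps(num_taps=181, mic_distance_m=0.144)
```

With these assumed forms, the delays come out antisymmetric about the middle tap, as stated in Eq. (2), and each delay equals $-(\mathrm{ITD}_{\max}/2)\sin\theta_i$, as in Eq. (4). Because the delays are plain real numbers used in a phase factor, nothing in this layout ties the tap spacing to a sampling period.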

Figure 2 shows that the dual delay-line structure is similar
to that adopted previously for sound localization (Liu
et al., 2000) except that a compensation element $\alpha_i(m)$ has
been added following each delay unit. These elements, which
compensate for differences in the intensity of noise at the
two microphones, are functions of both azimuth and frequency.
Appendix A derives the compensation values for the
ideal case of point sources with distance-dependent amplitude
decline, in a lossless medium. In practice, however, all
the values of $\alpha_i(m)$ and $\tau_i$ ($i = 1, \ldots, I$) are to be adjusted
empirically at the same time when the system is being calibrated
to compensate for asymmetries between the two microphones.
The compensation factors remain fixed so long as
the asymmetries are not changed. This fixed interaural intensity
difference (IID) corresponding to each interaural time
difference (ITD) mimics that observed in humans (Gaik,
1993). In the anechoic chamber tests reported below, the values
of ITD units $\tau_i$ ($i = 1, \ldots, I$) were set uniformly while
the values of IID $\alpha_i(m)$ ($i = 1, \ldots, I$) were determined
empirically.

In this subsection let us suppose the direction of the
desired source is known a priori and we use $i_{\mathrm{signal}} = s$ to
denote the in-phase (coincident) position along the dual
delay-line for the desired signal components. We use $i_{\mathrm{noise}} = g$
to denote the in-phase position for the noise signal components.
Note that the position index along the dual delay-line
is coincident with the index of the delay units in the left
channel. After equalization, the in-phase desired signal components
are identical in the left and right channels at $i_{\mathrm{signal}} = s$,
which is assumed to be $S_n(m) = A_s \exp[j(\omega_m t + \phi_s)]$;
likewise, the in-phase noise signal component is identical in
the left and right channels at $i_{\mathrm{noise}} = g$, which is assumed to
be $G_n(m) = A_g \exp[j(\omega_m t + \phi_g)]$, where $\phi_s$ and $\phi_g$ are the
initial phases for signal and noise, respectively. Based on
these assumptions, the left and right channel input (microphone)
signals are, respectively,

$$X_{Ln}(m) = \frac{1}{\alpha_s(m)}\, S_n(m) \exp(j\omega_m \tau_s) + \frac{1}{\alpha_g(m)}\, G_n(m) \exp(j\omega_m \tau_g) \tag{5}$$

and

$$X_{Rn}(m) = \frac{1}{\alpha_{I-s+1}(m)}\, S_n(m) \exp(j\omega_m \tau_{I-s+1}) + \frac{1}{\alpha_{I-g+1}(m)}\, G_n(m) \exp(j\omega_m \tau_{I-g+1}). \tag{6}$$

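Then, we can find the mathematical representation for the
equalized signals $\alpha_i(m) X_{Ln}^{(i)}(m)$ for the left channel, and
$\alpha_{I-i+1}(m) X_{Rn}^{(i)}(m)$ for the right channel at any arbitrary
point $i$ (except $i = s$), along the dual delay-line. They are

$$\alpha_i(m) X_{Ln}^{(i)}(m) = \frac{\alpha_i(m)}{\alpha_s(m)}\, S_n(m) \exp[j\omega_m(\tau_s - \tau_i)] + \frac{\alpha_i(m)}{\alpha_g(m)}\, G_n(m) \exp[j\omega_m(\tau_g - \tau_i)] \tag{7}$$

and

$$\alpha_{I-i+1}(m) X_{Rn}^{(i)}(m) = \frac{\alpha_{I-i+1}(m)}{\alpha_{I-s+1}(m)}\, S_n(m) \exp[j\omega_m(\tau_{I-s+1} - \tau_{I-i+1})] + \frac{\alpha_{I-i+1}(m)}{\alpha_{I-g+1}(m)}\, G_n(m) \exp[j\omega_m(\tau_{I-g+1} - \tau_{I-i+1})], \tag{8}$$

where

$$X_{Ln}^{(i)}(m) = X_{Ln}(m) \exp(-j\omega_m \tau_i) \tag{9}$$

and

$$X_{Rn}^{(i)}(m) = X_{Rn}(m) \exp(-j\omega_m \tau_{I-i+1}). \tag{10}$$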



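The subtraction step in the algorithm performs the following
operation on each signal pair, $\alpha_i(m) X_{Ln}^{(i)}(m)$ and
$\alpha_{I-i+1}(m) X_{Rn}^{(i)}(m)$, for $i = 1, \ldots, I$, at any location along
the delay line except the location where $i = s$:

$$X_n^{(i)}(m) = \frac{\alpha_i(m) X_{Ln}^{(i)}(m) - \alpha_{I-i+1}(m) X_{Rn}^{(i)}(m)}{[\alpha_i(m)/\alpha_s(m)] \exp[j\omega_m(\tau_s - \tau_i)] - [\alpha_{I-i+1}(m)/\alpha_{I-s+1}(m)] \exp[j\omega_m(\tau_{I-s+1} - \tau_{I-i+1})]}, \qquad \text{for } i \neq s. \tag{11}$$

A caveat in using Eq. (11) is that if the value of the denominator is too small, a small positive constant $\epsilon$ is added to limit the
magnitude of $X_n^{(i)}(m)$.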
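
The subtraction of Eq. (11) can be illustrated for a single frequency bin. The sketch below is a hedged toy example rather than the authors' implementation: it assumes all compensation factors $\alpha_i(m)$ equal 1, assumes the delay formula $\tau_i = (\mathrm{ITD}_{\max}/2)\sin[(i-1)\pi/(I-1) - \pi/2]$, and builds the two microphone spectra from the two-source model of Eqs. (5) and (6). The names `make_taps` and `subtract_at_tap`, the 37 taps, the 1 kHz bin, and the source amplitudes are all illustrative assumptions. Taps are indexed from 0, so the mirror of tap k is `last - k`. The near-zero guard replaces the denominator by a small constant, which is only one possible reading of the caveat about $\epsilon$.

```python
import cmath
import math

def make_taps(num_taps, itd_max):
    """Delays tau[0..num_taps-1] (0-based), using the assumed form of Eq. (1)."""
    return [0.5 * itd_max * math.sin(k / (num_taps - 1) * math.pi - math.pi / 2)
            for k in range(num_taps)]

def subtract_at_tap(x_left, x_right, omega, tau, i, s, eps=1e-9):
    """Eq. (11) at tap i for desired-source tap s, with every alpha set to 1."""
    last = len(tau) - 1                                          # mirror of tap k is last - k
    left_i = x_left * cmath.exp(-1j * omega * tau[i])            # Eq. (9)
    right_i = x_right * cmath.exp(-1j * omega * tau[last - i])   # Eq. (10)
    denom = (cmath.exp(1j * omega * (tau[s] - tau[i]))
             - cmath.exp(1j * omega * (tau[last - s] - tau[last - i])))
    if abs(denom) < eps:                                         # guard against a tiny denominator
        denom = complex(eps, 0.0)
    return (left_i - right_i) / denom

tau = make_taps(37, 0.144 / 343.0)      # 144 mm spacing, assumed c = 343 m/s
s, g = 18, 6                            # desired source at the middle tap, noise off to one side
omega = 2 * math.pi * 1000.0            # one 1 kHz bin
desired = 1.0 * cmath.exp(0.3j)         # S_n(m)
noise = 0.8 * cmath.exp(1.1j)           # G_n(m)

# Two-source microphone spectra following Eqs. (5) and (6) with alpha = 1
x_left = (desired * cmath.exp(1j * omega * tau[s])
          + noise * cmath.exp(1j * omega * tau[g]))
x_right = (desired * cmath.exp(1j * omega * tau[36 - s])
           + noise * cmath.exp(1j * omega * tau[36 - g]))

estimate = subtract_at_tap(x_left, x_right, omega, tau, i=g, s=s)
```

Evaluating the subtraction at the noise tap `g` removes the noise term and leaves the desired component, so `estimate` agrees with `desired` up to rounding error in this idealized setting.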

B. Physical meaning of the delay-line subtraction operation 

To analyze the operation, Eq. (11) can be expressed in the following form via substitution of Eqs. (7) and (8):

$$X_n^{(i)}(m) = S_n(m) + G_n(m)\, v_{s,g}^{(i)}(m), \qquad i \neq s, \tag{12}$$

where

$$v_{s,g}^{(i)}(m) = \frac{[\alpha_i(m)/\alpha_g(m)] \exp[j\omega_m(\tau_g - \tau_i)] - [\alpha_{I-i+1}(m)/\alpha_{I-g+1}(m)] \exp[j\omega_m(\tau_{I-g+1} - \tau_{I-i+1})]}{[\alpha_i(m)/\alpha_s(m)] \exp[j\omega_m(\tau_s - \tau_i)] - [\alpha_{I-i+1}(m)/\alpha_{I-s+1}(m)] \exp[j\omega_m(\tau_{I-s+1} - \tau_{I-i+1})]}, \qquad i \neq s. \tag{13}$$

Equations (11) and (13) can be simplified when the antisymmetric relationship in Eq. (2) is used. Thus,

$$X_n^{(i)}(m) = \frac{\alpha_i(m) X_{Ln}^{(i)}(m) - \alpha_{I-i+1}(m) X_{Rn}^{(i)}(m)}{[\alpha_i(m)/\alpha_s(m)] \exp[j\omega_m(\tau_s - \tau_i)] - [\alpha_{I-i+1}(m)/\alpha_{I-s+1}(m)] \exp[-j\omega_m(\tau_s - \tau_i)]}, \qquad \text{for } i \neq s, \tag{14}$$

and

$$v_{s,g}^{(i)}(m) = \frac{[\alpha_i(m)/\alpha_g(m)] \exp[j\omega_m(\tau_g - \tau_i)] - [\alpha_{I-i+1}(m)/\alpha_{I-g+1}(m)] \exp[-j\omega_m(\tau_g - \tau_i)]}{[\alpha_i(m)/\alpha_s(m)] \exp[j\omega_m(\tau_s - \tau_i)] - [\alpha_{I-i+1}(m)/\alpha_{I-s+1}(m)] \exp[-j\omega_m(\tau_s - \tau_i)]}, \qquad i \neq s. \tag{15}$$

When ignoring the compensation factors $\alpha_i(m)$, an interesting
observation of the subtraction [Eq. (11) or (14)] is that it
computes the difference between each pair of taps at the $i$th
location divided (shifted) by a factor that is determined only
by the difference in time delay between that location and the
location corresponding to the desired signal. Next we will
show that Eq. (11) performed at the location $i$ in the dual
delay-line corresponding to the noise source will cancel the
noise signal and provide an estimate of the desired signal.
Moreover, the location can be found using an energy quantity.

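A signal vector containing all the frequency components
for the preceding $N$ time frames is

$$\mathbf{x}^{(i)} = \left(X_1^{(i)}(1), X_1^{(i)}(2), \ldots, X_1^{(i)}(M), X_2^{(i)}(1), \ldots, X_2^{(i)}(M), \ldots, X_N^{(i)}(1), \ldots, X_N^{(i)}(M)\right)^T, \qquad i = 1, \ldots, I,$$

where $T$ denotes vector transposition. The energy $E[\mathbf{x}^{(i)}]$ of vector $\mathbf{x}^{(i)}$ is

$$E[\mathbf{x}^{(i)}] = \|\mathbf{x}^{(i)}\|_2^2 = \sum_{n=1}^{N} \sum_{m=1}^{M} |X_n^{(i)}(m)|^2 = \sum_{n=1}^{N} \sum_{m=1}^{M} |S_n(m) + G_n(m)\, v_{s,g}^{(i)}(m)|^2, \qquad i = 1, \ldots, I, \tag{16}$$

where the energy of the signal is

$$E[X_n^{(i)}(m)] = |X_n^{(i)}(m)|^2 = |S_n(m) + G_n(m)\, v_{s,g}^{(i)}(m)|^2. \tag{17}$$

To separate the complex signal into the desired signal and
noise, we define the following vectors in the similar manner

$$\mathbf{s} = \left(S_1(1), S_1(2), \ldots, S_1(M), S_2(1), \ldots, S_2(M), \ldots, S_N(1), \ldots, S_N(M)\right)^T$$

and

$$\mathbf{g}^{(i)} = \left(G_1(1) v_{s,g}^{(i)}(1), G_1(2) v_{s,g}^{(i)}(2), \ldots, G_1(M) v_{s,g}^{(i)}(M), G_2(1) v_{s,g}^{(i)}(1), \ldots, G_2(M) v_{s,g}^{(i)}(M), \ldots, G_N(1) v_{s,g}^{(i)}(1), \ldots, G_N(M) v_{s,g}^{(i)}(M)\right)^T,$$

where $i = 1, \ldots, I$. The energies of $\mathbf{s}$ and $\mathbf{g}^{(i)}$ are, respectively,

$$E[\mathbf{s}] = \|\mathbf{s}\|_2^2 = \sum_{n=1}^{N} \sum_{m=1}^{M} |S_n(m)|^2 \tag{18}$$

and

$$E[\mathbf{g}^{(i)}] = \|\mathbf{g}^{(i)}\|_2^2 = \sum_{n=1}^{N} \sum_{m=1}^{M} |G_n(m)\, v_{s,g}^{(i)}(m)|^2, \qquad i = 1, \ldots, I. \tag{19}$$

In general, the desired signal and the noise signal are
independent. Thus, vectors $\mathbf{s}$ and $\mathbf{g}^{(i)}$ are orthogonal. According
to the Pythagorean theorem, we would have the following
relationship:

$$E[\mathbf{x}^{(i)}] = \|\mathbf{x}^{(i)}\|_2^2 = \|\mathbf{s}\|_2^2 + \|\mathbf{g}^{(i)}\|_2^2 = E[\mathbf{s}] + E[\mathbf{g}^{(i)}], \qquad i = 1, \ldots, I. \tag{20}$$

Because $\|\mathbf{g}^{(i)}\|_2^2 \geq 0$,

$$E[\mathbf{x}^{(i)}] = \|\mathbf{x}^{(i)}\|_2^2 \geq \|\mathbf{s}\|_2^2 = E[\mathbf{s}], \qquad i = 1, \ldots, I. \tag{21}$$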

The equality in Eq. (21) is satisfied, or equivalently
$\min E[\mathbf{x}^{(i)}]$ occurs, only when $E[\mathbf{g}^{(i)}] = \|\mathbf{g}^{(i)}\|_2^2 = 0$, which
happens in either of the following two conditions:

(a) When $G_n(m) = 0$, i.e., the noise source is silent. In
this case, there is no need for doing localization of the noise
source and noise cancellation.

(b) When $v_{s,g}^{(i)}(m) = 0$. Eq. (15) indicates that this case
corresponds to $i = g = i_{\mathrm{noise}}$. Therefore, $E[\mathbf{x}^{(i)}]$ has its minimum
at $i = g = i_{\mathrm{noise}}$, and the minimum value, according to
Eq. (21), is $E[\mathbf{s}]$. Thus,

$$E[\mathbf{s}] = E[\mathbf{x}^{(i_{\mathrm{noise}})}] = \min_i E[\mathbf{x}^{(i)}]. \tag{22}$$

When $i = i_{\mathrm{noise}}$, Eq. (12) provides

$$\hat{S}_n(m) = X_n^{(i_{\mathrm{noise}})}(m) = S_n(m) + G_n(m)\, v_{s,g}^{(i_{\mathrm{noise}})}(m) = S_n(m). \tag{23}$$

In other words, in the presence of one desired source and
one noise source, the subtraction operation [Eq. (11)] applied
at the location $i = g$ ($= i_{\mathrm{noise}}$) in the dual delay-line structure
can produce an accurate estimate of the desired signal.
Namely,

$$\hat{S}_n(m) = \frac{\alpha_g(m) X_{Ln}^{(g)}(m) - \alpha_{I-g+1}(m) X_{Rn}^{(g)}(m)}{[\alpha_g(m)/\alpha_s(m)] \exp[j\omega_m(\tau_s - \tau_g)] - [\alpha_{I-g+1}(m)/\alpha_{I-s+1}(m)] \exp[j\omega_m(\tau_{I-s+1} - \tau_{I-g+1})]}. \tag{24}$$

The above analysis with energy also suggests a simple
method to estimate the location $g = i_{\mathrm{noise}}$ of the noise source
in the two-source situation where the direction of the desired
source is known a priori. Specifically, localization of the
noise source can be conducted by finding the location $i_{\mathrm{noise}}$
along the dual delay-line that produces the minimum value
of $E[\mathbf{x}^{(i)}]$ [Eqs. (11), (16), and (22)]. Once the location $i_{\mathrm{noise}}$
is determined, the azimuth of the noise source is easily determined
by using Eq. (3). The estimated noise location $i_{\mathrm{noise}}$
can be fed back to the dual delay-line for noise cancellation
and extraction of the desired signal using Eq. (24).

In Fig. 1, the blocks labeled "Integration over Time"
and "Integration over Frequency" together calculate the energy
$E[\mathbf{x}^{(i)}]$ defined in Eq. (16). The block labeled "Noise
Source Localization" locates the minimum point of $E[\mathbf{x}^{(i)}]$,
and then supplies this as the estimate of $i_{\mathrm{noise}}$ to the dual
delay-line. Since all the components $X_n^{(i)}(m)$, $i = 1, \ldots, I$,
have been computed at the localization step, now we only
need to pick up the appropriate component $X_n^{(i_{\mathrm{noise}})}(m)$, i.e.,
$\hat{S}_n(m)$; Eq. (24) does not need to be actually executed in this
case. Note that all the frequency computations so far are
conducted on the first half ($m = 1, \ldots, M/2$) of the whole
bandwidth. The block labeled "Frequency Synthesis" derives
the second half ($m = M/2 + 1, \ldots, M$) by means of the
symmetry property of the inverse discrete Fourier transform
(IDFT) and then conducts the IDFT to generate the time-domain
signal $\hat{s}_n(k)$.
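
The energy-minimum search used for noise localization can be sketched as follows. This is a toy Python example under stated assumptions, not the authors' code: compensation factors are set to 1, the delay formula is assumed to be $\tau_i = (\mathrm{ITD}_{\max}/2)\sin[(i-1)\pi/(I-1) - \pi/2]$, and the function names, tap count, frequency grid, and frame count are illustrative. To make the orthogonality assumed in Eq. (20) hold exactly, the synthetic noise is sign-flipped between paired frames; with real signals the orthogonality holds only approximately. Taps are indexed from 0.

```python
import cmath
import math

def make_taps(num_taps, itd_max):
    """Delays tau[0..num_taps-1] (0-based), using the assumed form of Eq. (1)."""
    return [0.5 * itd_max * math.sin(k / (num_taps - 1) * math.pi - math.pi / 2)
            for k in range(num_taps)]

def subtract_at_tap(x_left, x_right, omega, tau, i, s, eps=1e-9):
    """Eq. (11) at tap i with every alpha set to 1 (valid for i != s)."""
    last = len(tau) - 1
    diff = (x_left * cmath.exp(-1j * omega * tau[i])
            - x_right * cmath.exp(-1j * omega * tau[last - i]))
    denom = (cmath.exp(1j * omega * (tau[s] - tau[i]))
             - cmath.exp(1j * omega * (tau[last - s] - tau[last - i])))
    if abs(denom) < eps:
        denom = complex(eps, 0.0)
    return diff / denom

def localize_noise(frames, omegas, tau, s):
    """Return (tap with minimum energy, energy per tap), following Eqs. (16) and (22).

    frames[n][m] holds the pair (X_L, X_R) for frame n and frequency bin m.
    """
    energy = {}
    for i in range(len(tau)):
        if i == s:                       # Eq. (11) is not defined at the desired tap
            continue
        energy[i] = sum(abs(subtract_at_tap(xl, xr, omegas[m], tau, i, s)) ** 2
                        for frame in frames
                        for m, (xl, xr) in enumerate(frame))
    return min(energy, key=energy.get), energy

tau = make_taps(37, 0.144 / 343.0)
s, g = 18, 6                             # true taps of the desired and noise sources
omegas = [2 * math.pi * f for f in range(200, 3201, 200)]   # 16 bins, 200 Hz to 3.2 kHz
frames = []
for n in range(8):
    sign = 1.0 if n % 2 == 0 else -1.0   # flip the noise in every second frame
    frame = []
    for m, w in enumerate(omegas):
        desired = cmath.exp(1j * (0.7 * (n // 2) + 0.3 * m))          # unit-amplitude S_n(m)
        noise = sign * cmath.exp(1j * (1.9 * (n // 2) - 0.5 * m))     # unit-amplitude G_n(m)
        xl = desired * cmath.exp(1j * w * tau[s]) + noise * cmath.exp(1j * w * tau[g])
        xr = (desired * cmath.exp(1j * w * tau[36 - s])
              + noise * cmath.exp(1j * w * tau[36 - g]))
        frame.append((xl, xr))
    frames.append(frame)

i_noise, energy = localize_noise(frames, omegas, tau, s)
```

In this construction the energy at the noise tap equals the energy of the desired signal alone (8 frames times 16 bins of unit amplitude), and every other tap carries additional noise energy, which is the property that Eq. (22) relies on.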



C. Beamforming interpretation of the delay-line 
subtraction operation 

This system can be understood conceptually by an
equivalent beamforming procedure. Equation (11) can be expressed
in the following form:

$$X_n^{(i)}(m) = w_{Ln}^{(i)}(m)\, X_{Ln}(m) + w_{Rn}^{(i)}(m)\, X_{Rn}(m), \tag{25}$$

where $w_{Ln}^{(i)}(m)$ and $w_{Rn}^{(i)}(m)$ are beamforming weights. That
is, for each location along the dual delay-line at each frequency,
a specific nulling pattern is generated with the null
pointed toward the direction corresponding to the delay-line
location while the gain in the presumed target direction is
kept unity. Figure 3(a) shows a broadband intelligibility-weighted
beampattern (for definition, see Appendix B) for
selected nulling directions at -80°, -60°, -40°, -20°, 20°,
40°, 60°, and 80° azimuth (labeled A through H, respectively)
with the desired source at 0° azimuth. It can be seen
that Eq. (11) actually positions a null in the direction of the
noise source while keeping the broadband gain always unity
in the direction of the desired source. Since each nulling
pattern uses only 2 degrees of freedom, i.e., $w_{Ln}^{(i)}(m)$ and
$w_{Rn}^{(i)}(m)$, to satisfy the two constraints (directions of null and
unity-gain), the null patterns are fixed and there is no room
for optimizing the pattern shape. As will be presented
in Sec. III, this study, by taking advantage of the dual
delay-line network, the estimated source locations, as well as
the broadband characteristics of dialog speech, sought to find
an appropriate strategy [which is a nonlinear one as shown in
Eq. (27)] to utilize the simple null patterns for target extraction
among multiple interferers.

To extend the discussion on the azimuthal resolution of
the dual delay-lines in Sec. II A, let us look at a distinguishing
feature of Eq. (11). In the numerator of Eq. (11), the
signals in the two channels can be phase-shifted by any arbitrary
(small) values in the frequency domain. However, the
denominator eliminates the effect and thus
$X_n^{(i)}(m)$ contains an intact component of the desired signal
$S_n(m)$. Moreover, at the location $i = g = i_{\mathrm{noise}}$ where the
noise component is completely cancelled, only the desired
signal is left in the result. If interpreted as a beamformer, Eq.
(11) operated on the dual delay-line in the frequency domain
enables a null steering precisely in any arbitrary direction
regardless of the sampling rate.

FIG. 3. (a) The intelligibility-weighted beampattern created by Eq. (11) for
the cases where the desired source was always at 0° azimuth while the noise
source was at -80° (A), -60° (B), -40° (C), -20° (D), 20° (E), 40° (F),
60° (G), and 80° (H) azimuth, respectively. The inter-microphone distance
in this example was 144 mm. (b) The null-width of the intelligibility-weighted
beampattern at -30 dB as a function of azimuth.
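
The two constraints of the beamforming interpretation (unity gain toward the target, a null toward the chosen tap) can be verified with a short Python sketch. The weights below are derived here from Eq. (11) under the assumption that all compensation factors equal 1; they act on the raw microphone spectra and are not quoted from the paper. The delay formula $\tau_i = (\mathrm{ITD}_{\max}/2)\sin[(i-1)\pi/(I-1) - \pi/2]$, the names `null_weights` and `response`, the 37 taps, and the 1 kHz bin are likewise illustrative assumptions. Taps are indexed from 0.

```python
import cmath
import math

def make_taps(num_taps, itd_max):
    """Delays tau[0..num_taps-1] (0-based), using the assumed form of Eq. (1)."""
    return [0.5 * itd_max * math.sin(k / (num_taps - 1) * math.pi - math.pi / 2)
            for k in range(num_taps)]

def null_weights(omega, tau, i, s):
    """Weights (w_left, w_right) on the raw spectra; alpha = 1 and i != s assumed."""
    last = len(tau) - 1
    denom = (cmath.exp(1j * omega * (tau[s] - tau[i]))
             - cmath.exp(1j * omega * (tau[last - s] - tau[last - i])))
    w_left = cmath.exp(-1j * omega * tau[i]) / denom
    w_right = -cmath.exp(-1j * omega * tau[last - i]) / denom
    return w_left, w_right

def response(omega, tau, weights, d):
    """Beamformer output for a unit-amplitude source whose in-phase tap is d."""
    last = len(tau) - 1
    w_left, w_right = weights
    return (w_left * cmath.exp(1j * omega * tau[d])
            + w_right * cmath.exp(1j * omega * tau[last - d]))

tau = make_taps(37, 0.144 / 343.0)
omega = 2 * math.pi * 1000.0
weights = null_weights(omega, tau, i=6, s=18)        # null at tap 6, target at tap 18
gain_target = response(omega, tau, weights, d=18)    # expected to be close to 1
gain_null = response(omega, tau, weights, d=6)       # expected to be close to 0
```

Because the two weights are fully determined by these two constraints, the response in every other direction follows from them, which matches the remark that the null patterns are fixed.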

D. Extended application 

The method suggested in the preceding subsection for
localization and cancellation of the noise source is valid only
when the direction of the desired source is known a priori. It
cannot be directly applied in the situations where the direction
of the desired source is also unknown. Therefore, we
designed another system (System II in Fig. 4). The operation
of this system is divided into two steps: it localizes both the
desired source and noise source, and then selectively extracts
the signal from the desired direction. The localization step
employs an efficient localization method comprising dual
delay-line coincidence detection followed by a nonlinear operation
and then temporal and spectral integrations. The
method was described in detail in a previous paper (Liu
et al., 2000) in which it was shown to accurately localize
four talkers in an anechoic room and six talkers in computer
simulation. Thus the localization block in Fig. 4 determines
the in-phase positions, $i_{\mathrm{signal}} = s$ and $i_{\mathrm{noise}} = g$, of both the
desired and noise sources along the dual delay-line, which
were then used by the subtraction in Eq. (24) for extracting
the desired signal $\hat{S}_n(m)$. That is, except for the separate
source localization step, System II employs the same noise
cancellation method as described in the preceding subsection
[Eq. (24)].

FIG. 4. The block diagram of System II for extraction of one desired source
in the presence of a noise source when both source locations have to be
estimated by the system. See Liu et al. (2000) for details about the block
"Broadband Localization."

In comparison with System I, System II (without the
assumption of direction of the desired source) is functionally
more flexible. For example, in a situation with two talkers,
there is no need to align the dual microphones physically
toward one talker, and either talker can be taken as the desired
one. The user can choose between the two sources at
any time by using an electronic switch instead of changing
the pointing direction of the microphones. Actually the microphones
do not necessarily point to either of the sources.

We presented System I in the preceding subsection
mainly to illustrate the mechanism of the dual delay-line
subtraction [Eq. (11)] and to show its capacity for both noise
localization and desired-signal extraction. However, the operation
in System I is computationally expensive because Eq. (11)
must be applied to each tap in the dual delay-line for localization.
Moreover, its use is limited to a two-talker (with the
direction of the desired talker known a priori) situation. In
comparison, the coincidence detection scheme for localization
employed by System II is simpler in computation. What
is more important is that, as we will show in the next section,
System II can be further extended to situations with multiple
interfering talkers.

Although our localization method worked quite well in a
multiple-source environment, we normally observed relatively
larger and more frequent localization errors for the
lateral sources (Liu et al., 2000). The robustness of the noise
cancellation to localization errors can be roughly estimated
by looking at the null-width of the nulls in Fig. 3(a). For
example, the null-width evaluated at -30 dB is shown in
Fig. 3(b). It shows that the width is wider when the direction
of the null is farther away from the midline; that is, the noise
cancellation method can tolerate bigger localization errors
for lateral sources. Therefore, the greater localization errors
for lateral sources do not degrade the system performance in
terms of noise cancellation.

III. STRATEGY FOR BROADBAND MULTIPLE- 
SOURCE CANCELLATION 

The greatest challenge associated with extension of the
noise cancellation method from two-source situations to
multiple-source (>2) situations is that a two-input system in
theory can only effectively cancel the sound from one interfering
source. This is due to the fact that only one null can be
generated in the beampattern when using a two-microphone
array. In the narrow-band situation, an apparent solution is to
adaptively steer the null toward the most intense noise source
at each moment. In the broadband situation, since the input
signal is decomposed into its frequency components, one can
formulate a separate one-null beampattern for each frequency.
When there is one noise source as described in the
preceding section, the nulls at all frequencies are steered in
the same direction of the single noise source. However, when
there are more than two sources, each frequency bin can be
treated separately so that its beampattern null is adaptively
steered at each moment toward the noise source that is emitting
the most intense energy at that frequency, while maintaining
unity gain toward the desired source. It is actually a
dynamic application of the subtraction operation in Eq. (24).
This noise cancellation strategy is based on the following
rationale:

(a) Natural speech has many pauses and silent intervals, 
both of which usually occupy 60%-65% of the total 
time (Flanagan, 1972, p. 386). Therefore, when mul- 
tiple talkers speak simultaneously, there are always a 
number of short temporal gaps present. The number of 
overlapping talkers at each moment is usually smaller 
than the total number of talkers. 

(b) Even when multiple talkers speak at the same moment,
different talkers likely dominate at different frequency
bins at each moment due to differences in articulation
such as voicing and pitch. There are about ten
phonemes per second in conversational speech, more
than 60% of which are low-energy, high-frequency
consonants, and less than 40% of which are high-energy,
low-frequency vowels (Flanagan, 1972, p. 4).
In the presence of multiple talkers, the talkers who are
articulating the high-energy vowels are dominant in
both the localization (Liu et al., 2000) and cancellation
[Eq. (27)] and hence more easily removed. On the
other hand, due to the asymmetry of the filtering
response of the human ear, the masking effect of low
frequencies on high frequencies is much stronger than
the reverse (Jeffress, 1970, p. 95). In other words,
cancellation of a talker at his/her strongest frequency
components, which are likely the major components of a
vowel, may effectively cancel the masking effect of the
talker.

FIG. 5. The block diagram of System
III for extraction of one desired source
in the presence of more than one noise
source when all the source locations
have to be estimated by the system.
See Liu et al. (2000) for details about
the block "Broadband Localization."
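The single-null idea described above can be sketched numerically. The
following is a generic far-field delay-and-subtract illustration, not a
reproduction of Eq. (24): the speed of sound, the sign convention for the
inter-microphone delay, and the function names are assumptions made for
this example only.

```python
import numpy as np

C = 343.0   # assumed speed of sound in air, m/s
D = 0.144   # inter-microphone distance used in Sec. IV, m

def inter_mic_delay(theta_deg):
    """Far-field arrival-time difference between the two microphones (assumed model)."""
    return D * np.sin(np.radians(theta_deg)) / C

def null_response(theta_src, theta_null, theta_desired, freq_hz):
    """Response at one frequency to a unit source at theta_src when the single
    null is placed at theta_null and the gain toward theta_desired is unity."""
    w = 2.0 * np.pi * freq_hz
    d_src = inter_mic_delay(theta_src)
    d_null = inter_mic_delay(theta_null)
    d_des = inter_mic_delay(theta_desired)
    # Unit-amplitude source as seen at each microphone (phase referenced to the midpoint).
    x_left = np.exp(+0.5j * w * d_src)
    x_right = np.exp(-0.5j * w * d_src)
    # Align both channels for the null direction, then subtract.
    y = x_left * np.exp(-0.5j * w * d_null) - x_right * np.exp(+0.5j * w * d_null)
    # Normalize so that a source at theta_desired passes with unity gain.
    gain_toward_desired = 2j * np.sin(0.5 * w * (d_des - d_null))
    return y / gain_toward_desired

at_null = null_response(30.0, 30.0, 0.0, 1000.0)
at_desired = null_response(0.0, 30.0, 0.0, 1000.0)
```

In this sketch the normalization fails when the null direction coincides
with the desired direction, which reflects the fact that a two-input array
cannot separate two sources arriving from the same azimuth.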

To obtain the information about the location of each source
for the noise cancellation algorithm, the localization
algorithm in Liu et al. (2000) is employed. Suppose there are Q
noise sources with corresponding locations in the dual
delay-line being $i_{\mathrm{noise}1}, i_{\mathrm{noise}2}, \ldots, i_{\mathrm{noise}Q}$. Applying Eq. (24),
we obtain $X_n^{(i_{\mathrm{noise}1})}(m), X_n^{(i_{\mathrm{noise}2})}(m), \ldots, X_n^{(i_{\mathrm{noise}Q})}(m)$ for
each frequency $\omega_m$. If the localization is accurate, they all
should include the component of the desired signal at
frequency $\omega_m$ as well as components from interfering sources
other than the one to be canceled. In order to determine the
particular noise source to be canceled, the energies of
$X_n^{(i_{\mathrm{noise}1})}(m), X_n^{(i_{\mathrm{noise}2})}(m), \ldots, X_n^{(i_{\mathrm{noise}Q})}(m)$ are calculated
and compared. The minimum $X_n^{(i_{\mathrm{noise}})}(m)$ is taken as output
$\hat{S}_n(m)$:

$$\hat{S}_n(m) = X_n^{(i_{\mathrm{noise}})}(m), \tag{26}$$

where $X_n^{(i_{\mathrm{noise}})}(m)$ satisfies the following condition:

$$|X_n^{(i_{\mathrm{noise}})}(m)|^2 = \min\{\, |X_n^{(i_{\mathrm{noise}1})}(m)|^2, |X_n^{(i_{\mathrm{noise}2})}(m)|^2, \ldots, |X_n^{(i_{\mathrm{noise}Q})}(m)|^2, |\alpha_s(m) X_{Ln}^{(s)}(m)|^2 \,\}. \tag{27}$$



By referring to the energy analysis in Sec. II B, it is easy to
see that this strategy is logically consistent with Eq. (22). It
is noted that, in Eq. (27), we also include the original signal
$\alpha_s(m) X_{Ln}^{(s)}(m)$ for the following reason. The beampattern
designed above sometimes may amplify other less intense
noise sources. When the amount of noise amplification is
larger than the amount of cancellation of the most intense
noise source, it may be better to keep the input signal at that
frequency at that moment unchanged. An extended system
(System III in Fig. 5) was developed using System II (Fig. 4)
as the foundation. In comparison with System II, it identifies
multiple (>2) source directions and tentatively cancels each
noise source; specifically, it cancels the instantaneously most
intense source on a frequency-by-frequency basis [Eq. (27)].
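The per-frequency selection in Eqs. (26) and (27) can be sketched as
follows. This is a minimal illustration under stated assumptions: the
candidate outputs (one per localized noise direction) and the compensated
unprocessed signal are taken as given arrays, and the function and variable
names are hypothetical rather than taken from the paper.

```python
import numpy as np

def select_min_energy(candidates, unprocessed):
    """Pick, independently for each frequency bin, the lowest-energy option.

    candidates  : complex array, shape (Q, M); row q holds the output obtained
                  when the null is steered toward noise source q (assumed given).
    unprocessed : complex array, shape (M,); the compensated input signal,
                  kept as a fallback option as described for Eq. (27).
    Returns the selected spectrum (shape (M,)) and the winning row index per bin.
    """
    options = np.vstack([candidates, unprocessed[np.newaxis, :]])
    energy = np.abs(options) ** 2
    winner = np.argmin(energy, axis=0)
    selected = options[winner, np.arange(options.shape[1])]
    return selected, winner

# Toy numbers for illustration only (Q = 2 noise directions, M = 3 bins).
candidates = np.array([[1.0 + 0j, 3.0 + 0j, 0.5 + 0j],
                       [2.0 + 0j, 0.1j, 4.0 + 0j]])
unprocessed = np.array([0.2 + 0j, 5.0 + 0j, 6.0 + 0j])
selected, winner = select_min_energy(candidates, unprocessed)
```

A winning index equal to Q (here 2) means the unprocessed signal was kept
for that bin, which corresponds to the case where steering a null would
amplify more noise than it removes.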

The cancellation step relies on the localization step to
provide azimuth information for each source, which is usually
a difficult task, especially in the presence of multiple
sources. However, as shown in our previous paper (Liu
et al., 2000), our localization system can satisfactorily
localize four sources in an anechoic room and six sources in
simulation, if not more. In addition, the cancellation step
does not rigidly require that all the sources be
localized accurately. As a matter of fact, our strategy is to
cancel the strongest noise component at each frequency
bin; this is usually emitted from one of those momentarily
relatively intense noise sources, which are easy to localize
compared with other relatively less intense sources.



IV. EXPERIMENT 

For the case of two talkers, once the locations of the
talkers are determined, the sound from one talker can be
removed by using System I or System II with essentially no
residual noise while the estimated desired signal is
distortionless. This was clearly supported in theory and also
previously demonstrated by Banks (1993). In this paper we
present the results of four-talker experiments using the noise
cancellation of System III.

FIG. 6. Top view of the spatial configuration of one of our experimental
setups. This experimental setup corresponds to test configuration #2 in
Tables I and II.

The experiments employed a record-and-play procedure
with four high-fidelity loudspeakers (ADS 200LC) and were
conducted in an anechoic room and in a moderately quiet
conference room with a reverberation time constant of
approximately 400 ms. The speech materials consisted of
spondaic words spoken by native speakers of American English;
these were recorded in a sound studio at the Beckman
Institute. All the speech recordings were equalized in average
intensity and played through the loudspeakers. The words in
each experimental condition were temporally aligned, i.e., it
was equivalent to all the talkers starting at the same time.
The inter-microphone distance was 144 mm. All the
loudspeakers were at a fixed equal distance of 1.0 m (unless
otherwise stated) from the midpoint between the two
microphones, and all the loudspeakers and microphones were at
the same elevation (approximately 1 m from the floor).
Correspondingly, the compensation factors in Eq. (11) were
determined for that distance.

The signals were low-pass filtered at 6 kHz and sampled
at a 12.8-kHz rate with 16-bit quantization. In the short-term
spectral analysis, a 20-ms segment of signal was weighted by
a Hamming window, padded with zeros to 2048 points, and
Fourier transformed with a frequency resolution of about 6 Hz.
Consecutive frames overlapped by 15 ms. The values of the
time delay units ($i = 1, \ldots, I$) were determined such that
the dual delay-line has a uniform azimuthal resolution of
0.5° (i.e., $I = 361$).
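The analysis parameters quoted above translate into the following frame
sizes. This is a minimal sketch of a short-term spectral analysis with those
settings, not the authors' implementation; the function name is
hypothetical, and the azimuth grid spanning -90° to +90° is an assumption
that is consistent with 361 points at 0.5° spacing.

```python
import numpy as np

FS = 12_800                                # sampling rate, Hz
FRAME_LEN = int(round(0.020 * FS))         # 20-ms segment -> 256 samples
OVERLAP = int(round(0.015 * FS))           # 15-ms overlap -> 192 samples
HOP = FRAME_LEN - OVERLAP                  # 5-ms frame advance -> 64 samples
NFFT = 2048                                # zero-padded transform length
BIN_SPACING_HZ = FS / NFFT                 # 6.25 Hz, i.e., "about 6 Hz"

def short_term_spectra(signal):
    """Hamming-windowed, zero-padded spectra of consecutive overlapping frames."""
    window = np.hamming(FRAME_LEN)
    n_frames = 1 + (len(signal) - FRAME_LEN) // HOP
    frames = np.stack([signal[k * HOP:k * HOP + FRAME_LEN] * window
                       for k in range(n_frames)])
    return np.fft.rfft(frames, n=NFFT, axis=1)

# Assumed azimuth grid for the dual delay-line taps: 0.5-degree steps, I = 361.
azimuths_deg = np.linspace(-90.0, 90.0, 361)

# One second of placeholder noise, used only to exercise the function.
spectra = short_term_spectra(np.random.default_rng(0).standard_normal(FS))
```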

Two groups of talkers were used in our tests. Each group
consisted of four different talkers speaking different spondaic
words. Five tests were conducted for each group; each test
adopted a different azimuthal arrangement of the sources,
with the separation between adjacent sources ranging from
10° to 75°. Figure 6 illustrates one of the configurations. Each
test consisted of four subtests in which each talker was taken in
turn as the desired source with all the other talkers as the
noise sources. The localization of the talkers was conducted
using both the "direct" and "stencil" methods in Liu et al.
(2000).

The system performance was evaluated using an objective
intelligibility-weighted measure, whose concept was first
proposed by Peterson (1989) and described in detail in Liu
and Sideman (1996). Specifically, we used
intelligibility-weighted signal cancellation,
intelligibility-weighted noise cancellation, and net
intelligibility-weighted gain (see Appendix B for definition).

As mentioned above, an array of tests was conducted
with a number of variables such as different talkers, different
spondaic words, different azimuthal arrangements, different
localization methods, and different combinations of the
variables. However, it is not necessary to present all our results
since most of the variables, as it turned out, have no
statistically significant effect on the final noise cancellation
performance. Specifically, the experimental results showed
no statistical difference due to talkers and words spoken. They
also showed no significant effect from using the "direct"
method versus the "stencil" method for source localization
(Liu et al., 2000). Therefore, we only present the results
from Group #1 with the location information derived using
the "stencil" method. As mentioned above, it contained five
tests corresponding to five different spatial configurations.
For each test, we present results from only one (arbitrarily
chosen) of the four subtests since the location of the desired
source has no obvious effect on noise cancellation. Table I
shows typical results chosen from the anechoic chamber test



TABLE I. Experimental results obtained in the anechoic room using System III. The results shown were derived from five tests (configurations) from Group
#1 including two male speakers (M1 and M2) and two female speakers (F1 and F2). The spondaic word spoken by each talker is given in italic. The values
in parentheses are cancellation of the desired sources in dB. Configuration test #2 is shown in Fig. 6. For each test, the first row gives the source
azimuths and the second row gives the cancellation values.

        Intelligibility-weighted signal cancellation (dB)      Intelligibility-    Net intelligibility-
        M1           M2             F1           F2            weighted noise      weighted gain
Test    "armchair"   "playground"   "pancake"    "woodwork"    cancellation (dB)   (dB)

#1      -75°         0°             20°          75°
        8.04         (0.15)         4.98         3.07          9.25                9.09

#2      30°          -45°           60°          -10°
        8.34         4.71           4.12         (0.67)        8.38                7.71

#3      10°          -80°           -50°         45°
        (0.55)       6.90           5.57         3.83          8.56                8.00

#4      -30°         15°            5°           -60°
        10.53        2.07           (1.14)       6.35          8.27                7.13

#5      -25°         25°            -70°         80°
        8.09         (0.34)         5.82         4.46          8.78                8.44
TABLE II. Same as Table I except that the recordings were made in a moderately quiet conference room with a 400 ms reverberation time [RT was derived
using Schroeder's method; see J. Acoust. Soc. Am. 37, 409-412 (1965)].

        Intelligibility-weighted signal cancellation (dB)      Intelligibility-    Net intelligibility-
        M1           M2             F1           F2            weighted noise      weighted gain
Test    "armchair"   "playground"   "pancake"    "woodwork"    cancellation (dB)   (dB)

#1      -75°         0°             20°          75°
        4.82         (0.55)         4.07         2.06          6.73                6.18

#2      30°          -45°           60°          -10°
        6.27         4.18           3.09         (0.58)        7.26                6.69

#3      10°          -80°           -50°         45°
        (1.12)       3.85           2.91         2.71          5.75                4.63

#4      -30°         15°            5°           -60°
        6.29         (0.85)         0.91         3.61          6.16                5.25

#5      -25°         25°            -70°         80°
        5.70         (0.69)         4.28         2.92          6.97                6.29



while Table II shows the results from the conference room test. In
the tables, the numbers in parentheses represent the degree of
cancellation in dB of the desired source (which should ideally
be 0 dB) and the other numbers represent the degree of
noise cancellation for each noise source. Because we had
separate recordings of speech signals from each talker, we
applied the same processing both on the complex signal and
synchronously on each signal corresponding to each talker as
well. As such, we were able to tell the effect of processing on
each signal involved. The next-to-last column in the
tables shows the degree of cancellation for all the noise
sources lumped together, while the last column gives the net
intelligibility-weighted improvement (which considers both
noise cancellation and loss in the desired signal). Our results
from the anechoic room show that, in the
intelligibility-weighted measure, the cancellation strategy was
able to cancel each noise source by 3-11 dB, while the
degradation in the desired source was very small (mostly smaller
than 0.5 dB). The total noise cancellation was between 8 and 10 dB.
For the conference room, the cancellation was roughly 2 dB
less, indicating that the room reverberation degraded the
system performance somewhat. In spite of the drop in system
performance, the system still produced a sizable gain in
speech intelligibility.
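The relation between the last two columns can be checked with simple
arithmetic. The sketch below assumes that the net gain equals the lumped
noise cancellation minus the cancellation of the desired source, which is
how the last column is described above; the values are copied from Table I
and the variable names are illustrative only.

```python
# (noise cancellation, desired-source cancellation, reported net gain), all in dB,
# taken from Table I for tests #1 to #5.
table_i = {
    1: (9.25, 0.15, 9.09),
    2: (8.38, 0.67, 7.71),
    3: (8.56, 0.55, 8.00),
    4: (8.27, 1.14, 7.13),
    5: (8.78, 0.34, 8.44),
}

# Difference between the recomputed and the reported net gain for each test.
mismatch_db = {
    test: abs((noise - desired) - net)
    for test, (noise, desired, net) in table_i.items()
}
```

Under this assumption every recomputed value agrees with the reported net
gain to within the rounding of the two-decimal entries.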

In order to gain insight into the effect of the signal
processing on each talker, we chose one subtest example
(anechoic room; desired source: F1 at 60°; noise sources: M1
at 30°, M2 at -45°, and F2 at -10°). We display the signal
waveform of each talker as well as the complex signal of all
the four talkers, before [Fig. 7(A)] and after [Fig. 7(B)] the
signal processing. Comparison of the two panels shows a
great attenuation of the interfering talkers (M1, M2, and F2)
while the desired signal (F1) is essentially not attenuated and
the distortion of the desired talker is imperceptible. A
moment-by-moment comparison shows that the momentarily
strongest noise source was always reduced, indicating that
the system adapted rapidly. The last trace in Fig. 7(B) is the
system output, which turned out to be cleaner and closer to
the desired speech [F1 in Fig. 7(A)] than the noisy
unprocessed signal [the last trace in Fig. 7(A)].

In an informal listening experiment with normal hearing
listeners, we found the unprocessed signal to be impossible
to understand, even when the spatial cues were retained.
After the processing, however, the extracted speech from a
desired source was easily understandable.

Limited by our experimental facility, we only conducted 
on-site acoustic tests for four-talker situations. However, our 
computer simulation results for six-talker situations were 
quite similar. To avoid redundancy, we omit presentation of 
the details. Basically, we obtained a 7-10 dB enhancement 
in the intelligibility-weighted signal-to-noise ratio when 
there were six equally loud, temporally aligned speech 
sounds originating from six different sources. 



V. DISCUSSION 

There are three key differences between the algorithm
proposed in this paper and conventional adaptive beamformers
such as the Frost and Griffiths-Jim beamformers (Van
Veen and Buckley, 1988), namely, (i) direct
frequency-domain null steering, (ii) explicit source localization,
and (iii) implicit utilization of dialogue characteristics. The
frequency-domain null-steering algorithm described herein
does rapid, independent steering of the beampattern at each
frequency. Independent steering allows rapid steering of the
single null at each frequency to the dominant interferer at
that time and frequency. It provides a maximum potential to
cancel intense components emitted from multiple interferers
with only two inputs available. What distinguishes this
method from other methods is that this independent steering
can be implemented with no time delay when it is provided
with instant localization information. When processing
signals with strong, rapidly varying time-frequency structure
such as speech, the net effect is to allow significant
cancellation of several simultaneous interferers by exploiting
differences in their instantaneous time-frequency structures. In
contrast, slowly adapting time-domain algorithms such as the
Frost (Frost, 1972) and Griffiths-Jim (Griffiths and Jim,
1982) beamformers are unable to track the nonstationary
structure rapidly enough to achieve significant cancellation
of more than a single interferer. This claim is clearly
demonstrated in the results of the comparison experiment
presented in Yang et al. (2000) using complete sentences under
a variety of signal-to-noise conditions.

FIG. 7. The signal waveform of each talker as well as the complex signal of all the four talkers before (A) and after (B) the signal processing. See Fig. 6 for
the test configuration.

The performance of this algorithm is comparable to the
conventional beamformers for the case of a single interferer,
but markedly better for cases involving more than one
interferer. The comparisons conducted in Zheng et al. (2001)
were made in computer simulation where up to four
interferers were used at four different SNR settings (-6, -3, 0, +3
dB). In the presence of two or more interferers, the present
method provided 6-7 dB of SNR gains, while the Frost
beamformer and the Griffiths-Jim beamformer had SNR
gains in the 2-5 dB range.

The second difference between the conventional
beamformers and the proposed method is that the latter explicitly
identifies the spatial directions of the target and interferers
via a nonlinear, cross-frequency localization procedure (Liu
et al., 2000) and exploits this information to optimally steer
the null pattern in each frequency bin. The localization is
conducted on a frame-by-frame basis and the results are used
immediately by the cancellation on the same frame.
Therefore, as mentioned above, the adaptation time is virtually
zero. This feature is especially important when processing
signals with rapidly varying time-frequency structure such as
speech. Explicit source localization also offers several other
potential advantages, including the ability to steer toward a
spatially moving target, better and more robust estimation of
signal and interference locations from which to optimize the
beam patterns, and the ability (not explored here) to perform
additional useful tasks such as auditory scene
characterization. The results in Zheng et al. (2001) suggest that
these characteristics may indeed be advantageous in many
situations (with different numbers of interferers, different spatial
configurations, and different SNRs), particularly when the
interferers are in close azimuthal proximity to the target.

The third, and most distinctive, difference is that our
method takes full advantage of the characteristics and
masking effect of human dialogue as detailed in Sec. III. That
strategy makes it possible to utilize a limited resource (two
inputs only) to obtain maximum speech intelligibility
enhancement benefits such as effective cancellation of multiple
interfering sources.

The improvement in signal quality reported in Tables I
and II is encouraging but preliminary. The algorithm's
performance in anechoic conditions (8-10 dB cancellation) is
sufficient to justify further research, while the performance
in the conference room (2 dB less cancellation) raises the
question as to whether, when used in a real-time
environment, the quality of the cancellation will degrade so as to no
longer be useful. Practical computational limitations
restricted the work reported here, although improvements have
allowed off-line analysis over a wider range of materials
(Zheng et al., 2001). A related frequency domain
beamformer (Lockwood et al., 1999) has been implemented in
real-time (Elledge, 2000) with highly satisfying subjective
sound improvement and quality (Larsen et al., 2001). A
real-time version of the present algorithm is in the process of
implementation; it should permit subjective evaluations to
determine whether the technique is viable for hearing aid and
other applications.

FIG. 8. Top view of the geometry of the source-microphone distance.

One practical issue is that when the source to micro- 
phone distance is very short (e.g., 2 m or less), it is important 
to compensate for left-right differences in channel intensity; 
indeed preliminary tests indicated degradation of about 1 dB 
in the total net gain without compensation. However, for 
larger source-microphone distances (e.g., >2 m), the differ- 
ence with and without compensation was insignificant. 

VI. SUMMARY AND CONCLUSION 

In this paper, we have presented the technique and
experimental results that illustrate the performance of signal
processing systems designed for effective extraction of a
desired signal in the presence of multiple competing talkers.
The signal processing technique is based on the dual delay-line
structure, a well-known biological network for binaural
hearing. The entire system consists of two steps: localization of
all sources and extraction of the desired source. Our anechoic
chamber tests showed 8-10 dB of speech enhancement in
the presence of four equally loud, temporally aligned talkers;
our computer simulation showed 7-10 dB of speech
enhancement in the presence of six equally loud, temporally
aligned talkers. The system can localize all the sources
present and allow the user to selectively extract any one of
them; hence it is more flexible than assuming that the desired
source is always straight ahead. It can be applied in many
applications such as radar, sonar, communications, and
robots.

It is noted that in the present study we focused on sepa- 
rating out a particular talker from all the other competing 
talkers, i.e., selective hearing. It is technically straightfor- 
ward to convert the present system to a simulator that can 
capture the source to which the gaze of the listener is di- 
rected at any time instant. It is also possible to simply use 
multiple noise-cancellation components following the local- 
ization so as to extract each of the sources within the envi- 
ronment, i.e., to achieve separation of multiple signals. 

The dual delay-line structure implies that the
computation is highly parallel. That, and the repeated use of the
Fourier transform, make it practical to implement the algorithm
by means of VLSI for a fast, miniature device.

Our future work includes evaluation using formal tests 
in normal listening rooms with human subjects with real- 
time versions of the algorithm. We will also extend our al- 
gorithms to compensate for reverberant environments. 
APPENDIX A: CALCULATION OF THE AMPLITUDE
FACTORS

In this appendix we illustrate the calculation of the
amplification factors $\alpha_i(m)$, which are used to compensate for
differences in the amplitudes of signals arriving at the two
microphones. In this example we model the sound source as
a simple point source, ignore the absorption of energy by the
medium, and assume the amplitude variation depends solely
on the difference in the distances from the source to the
microphones. In this case, the amplitude compensation is
independent of frequency.

The amplitude of the received sound pressure $|p|$ varies
with the source-receiver distance $r$:

$$|p| \propto \frac{1}{r} \tag{A1}$$

or

$$\frac{|p_L|}{|p_R|} = \frac{r_R}{r_L}, \tag{A2}$$

where $|p_L|$ and $|p_R|$ are the amplitudes of the sound pressures at
the two microphones (Kinsler et al., 1982, p. 168). According
to the geometry in Fig. 8, the distances from the source
to the left and right microphones, $r_L$ and $r_R$, are, respectively,

$$r_L = \sqrt{(l \sin\theta_i + D/2)^2 + (l \cos\theta_i)^2} = \sqrt{l^2 + lD \sin\theta_i + D^2/4} \tag{A3}$$

and

$$r_R = \sqrt{(l \sin\theta_i - D/2)^2 + (l \cos\theta_i)^2} = \sqrt{l^2 - lD \sin\theta_i + D^2/4}. \tag{A4}$$

For a pair of taps in the dual delay-line in Fig. 2, in order
to equalize the signals at the tap outputs, the compensation
factors $\alpha_i(m)$ and $\alpha_{I-i+1}(m)$ must satisfy the following
condition:

$$|p_L|\,\alpha_i(m) = |p_R|\,\alpha_{I-i+1}(m). \tag{A5}$$

Substituting Eq. (A2) into Eq. (A5), the above condition
becomes

$$\frac{r_L}{r_R} = \frac{\alpha_i(m)}{\alpha_{I-i+1}(m)}. \tag{A6}$$

We define the value of $\alpha_i(m)$ to be equal to

$$\alpha_i(m) = K \sqrt{l^2 + lD \sin\theta_i + D^2/4}, \tag{A7}$$

where K has units of inverse length and is chosen for a
convenient amplitude level. Applying the definition in Eq. (A7),
the value of $\alpha_{I-i+1}(m)$ will be

$$\alpha_{I-i+1}(m) = K \sqrt{l^2 + lD \sin\theta_{I-i+1} + D^2/4} = K \sqrt{l^2 - lD \sin\theta_i + D^2/4}, \tag{A8}$$

where the relationship $\sin\theta_{I-i+1} = -\sin\theta_i$ can be obtained
by substituting $I-i+1$ for $i$ in Eq. (3). By substituting Eqs.
(A7) and (A8) into Eq. (A6), one can verify that the values
assigned to $\alpha_i(m)$ in Eq. (A7) satisfy the condition in Eq.
(A6).
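The verification suggested at the end of this appendix can be carried out
numerically. The sketch below is illustrative only: it reads l as the
distance from the source to the midpoint between the microphones and D as
the inter-microphone distance (an interpretation of the geometry in Eqs.
(A3) and (A4)), uses the 1.0 m and 144 mm values from Sec. IV, and sets
K = 1 arbitrarily. The function names are hypothetical.

```python
import numpy as np

L_SRC = 1.0    # source-to-midpoint distance l, m (value used in Sec. IV)
D_MIC = 0.144  # inter-microphone distance D, m (value used in Sec. IV)
K = 1.0        # arbitrary scale factor with units of inverse length

def path_lengths(theta_deg, l=L_SRC, d=D_MIC):
    """Distances r_L and r_R from the source to each microphone, Eqs. (A3)-(A4)."""
    theta = np.radians(theta_deg)
    r_left = np.hypot(l * np.sin(theta) + d / 2, l * np.cos(theta))
    r_right = np.hypot(l * np.sin(theta) - d / 2, l * np.cos(theta))
    return r_left, r_right

def compensation_factors(theta_deg, l=L_SRC, d=D_MIC, k=K):
    """alpha_i and alpha_{I-i+1} from Eqs. (A7)-(A8), using sin(theta_{I-i+1}) = -sin(theta_i)."""
    s = np.sin(np.radians(theta_deg))
    alpha_i = k * np.sqrt(l**2 + l * d * s + d**2 / 4)
    alpha_mirror = k * np.sqrt(l**2 - l * d * s + d**2 / 4)
    return alpha_i, alpha_mirror

theta_grid = np.linspace(-90.0, 90.0, 361)   # assumed 0.5-degree azimuth grid
r_left, r_right = path_lengths(theta_grid)
alpha_i, alpha_mirror = compensation_factors(theta_grid)

# Left-right level difference implied by Eq. (A2), in dB, across the grid.
level_difference_db = 20.0 * np.log10(r_right / r_left)
```

With these inputs the ratio alpha_i / alpha_mirror matches r_L / r_R at
every azimuth, which is the condition in Eq. (A6).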



APPENDIX B: DEFINITION OF THE INTELLIGIBILITY-
WEIGHTED MEASURE

For any signal s, the intelligibility-weighted measure
$\Gamma(s)$ is calculated by (Link and Buckley, 1993)

$$\Gamma(s) = \int_{BW} w_{AI}(f)\, 20 \log_{10} \mathrm{rms}_{1/3}(|S(f)|)\, df, \tag{B1}$$

where

$$w_{AI}(f) = \frac{1/[1 + (f/1925)^2]}{\int_{BW} 1/[1 + (f'/1925)^2]\, df'}, \tag{B2}$$

$$\mathrm{rms}_{1/3}(|S(f)|) = \sqrt{\frac{1}{(2^{1/6} - 2^{-1/6}) f} \int_{2^{-1/6} f}^{2^{1/6} f} |S(f')|^2\, df'}, \tag{B3}$$

and BW denotes the frequency range. The system
improvements in the intelligibility-weighted measure for the target
signal T and interference I are, respectively,

$$\Delta\Gamma(T) = \Gamma(T_o) - \Gamma(T_i) \tag{B4}$$

and

$$\Delta\Gamma(I) = \Gamma(I_i) - \Gamma(I_o), \tag{B5}$$

where the subscripts i and o denote the input and output,
respectively. The overall (or net) intelligibility-weighted
gain, $G_I$, is the sum of the two measures, thus

$$G_I = \Delta\Gamma(T) + \Delta\Gamma(I). \tag{B6}$$
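The measure defined above can be approximated on a discrete frequency
grid. The following is a rough sketch under stated assumptions, not the
authors' code: the integrals are replaced by trapezoidal sums, the
third-octave rms is taken as the root of the mean power over the grid
points inside the band, the analysis band of 100 Hz to 6 kHz is chosen for
illustration, and the spectra are synthetic placeholders rather than
experimental data. All function names are hypothetical.

```python
import numpy as np

def _trapezoid(y, x):
    """Trapezoidal-rule integral of samples y over grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def ai_weight(freqs):
    """Normalized weighting of Eq. (B2); integrates to 1 over the grid."""
    raw = 1.0 / (1.0 + (freqs / 1925.0) ** 2)
    return raw / _trapezoid(raw, freqs)

def third_octave_rms(freqs, magnitude):
    """Approximate rms of |S| over the one-third-octave band centered on each frequency."""
    power = magnitude ** 2
    out = np.empty_like(power)
    for k, f in enumerate(freqs):
        band = (freqs >= f * 2 ** (-1 / 6)) & (freqs <= f * 2 ** (1 / 6))
        out[k] = np.sqrt(power[band].mean())
    return out

def gamma(freqs, magnitude):
    """Discrete approximation of the intelligibility-weighted measure, Eq. (B1)."""
    level_db = 20.0 * np.log10(third_octave_rms(freqs, magnitude))
    return _trapezoid(ai_weight(freqs) * level_db, freqs)

# Synthetic magnitude spectra, for illustration only.
freqs = np.arange(100.0, 6000.0 + 1e-9, 6.25)
rng = np.random.default_rng(0)
target_in = rng.uniform(0.5, 1.5, freqs.size)
noise_in = rng.uniform(0.5, 1.5, freqs.size)
target_out = 0.9 * target_in   # slight loss of the desired signal
noise_out = 0.4 * noise_in     # attenuation of the interference

delta_target = gamma(freqs, target_out) - gamma(freqs, target_in)   # Eq. (B4)
delta_noise = gamma(freqs, noise_in) - gamma(freqs, noise_out)      # Eq. (B5)
net_gain = delta_target + delta_noise                               # Eq. (B6)
```

Because the weighting integrates to one, a uniform scaling of a spectrum by
a factor g shifts the measure by 20 log10(g) dB, which is why the two
deltas in this toy example reduce to the chosen scale factors.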



In our experiment, since we had separate recordings of
the target and noise signals, i.e., $T_i$ and $I_i$ (the latter might
include more than one interfering talker), we were able to
apply the same processing framewise on either of them and
obtain the results $T_o$ and $I_o$, respectively. The framewise
processing [Eq. (11)], however, was determined based on the
target and noise signals mixed together as would be
encountered in the real situation. It is noted that the spectrum S(f)
was computed based on the full-length signal, which in our
case was a whole spondaic word. We used the full long-term
spectrum, as opposed to a frame-by-frame spectrum, for two
reasons: (1) it was consistent with the way the
intelligibility-weighted measure was applied in other papers published in
the area, such that our results could be compared directly
with earlier results; (2) since our system has virtually no
adaptation time (i.e., it almost always is successful in
localizing the strongest interference within a few milliseconds),
there is no advantage to computing with the short-term
spectra.

Since the intelligibility-weighted measure was
constructed as an estimate of the subjective improvement based
on the objective calculation, it deliberately emphasized the
low frequency domain according to the "critical-band"
theory. However, because the low frequency domain is
always the hardest to clean with approaches using
multi-microphone arrays of limited size, the
intelligibility-weighted measure usually has a smaller value (~1 dB
difference in our experiment) than its non-weighted
counterpart, i.e., SNR. Nonetheless, this effect does not change
the overall picture of the performance; in particular, the
comparison of our system with others, as given in the paper,
remains valid.

Similarly, if we denote the array beampattern as $E(f, \theta)$,
where $f$ is frequency and $\theta$ is the incident direction, the
intelligibility-weighted beampattern, $\bar{E}(\theta)$, is defined by
(Liu and Sideman, 1996)

$$\bar{E}(\theta) = \int_{BW} w_{AI}(f)\, E(f, \theta)\, df. \tag{B7}$$